Seeking a Cloud Service Reliability Engineer in Chantilly, VA to support our Intelligence Community customer as part of a highly talented, highly motivated, and high-performing team. As part of the Infrastructure O&M Support team, you will be responsible for, among other things, the availability, performance, monitoring, and incident response of the Cloud Infrastructure that we support in a 24/7 environment.
- Ensure the uptime of our multi-tenant infrastructure
- Work closely with the engineering teams to improve our platforms and eliminate complexity from architecture and processes
- Configure and use state-of-the-art monitoring tools to gather insights and then act upon the results
- Conduct incident response and in-depth root cause analysis.
- This position is hands-on, requiring the ability to provide first-level system and network support and to identify resolutions to problems.
- The candidate will be responsible for monitoring daily software and network operations in a distributed environment.
- The candidate will also be responsible for monitoring, working with users on fault isolation and resolution, and system analysis and reporting.
- This job will include shift work to allow for complete 24/7 monitoring of software systems.
- You have at least an associate’s degree in Engineering or Computer Technology or Advanced Military Training.
- You have at least 2 years of relevant senior-level experience.
- You have experience working with Windows and Linux operating systems.
- You have experience with distributed computing technologies.
- You have experience with virtualization technologies (e.g. OpenStack, Citrix XenServer, Red Hat Enterprise Virtualization, and/or VMware), Docker containers, Ansible, and Heat templates.
- You have experience with front-end processing and network gateway appliances and/or software.
- You have experience working in a customer environment and/or a classified environment.
- You have a background in supporting software and/or network operations with a clear understanding of networking fundamentals.
- You have experience with Linux/Unix and Windows operating systems.
- You hold a current CompTIA Security+, CASP, or CISSP certification and a Computing Environment certification (e.g. Linux+, RHCSA, RHCE, MCSA).
- You are able to effectively communicate both with customers and technical staff.
- You are willing to work in a 24/7 environment
- Have experience with infrastructure automation technologies, including OpenStack, Ansible, Heat, and Puppet, and with cloud computing fundamentals.
- Have a good understanding of KVM Virtualization technologies.
- Have previous experience with networking equipment.
- Experience with Intelligence or DoD programs, either within the military or as a civilian contractor, is desired.
- Active TS/SCI with Full Scope Polygraph