PENDING CONTRACT AWARD
Mission Objectives - PAE Fires operates multiple separate networks consisting of ESXi hosts, VMs/appliances, Red Hat OpenShift, and VDI desktops, all of which require highly secure, highly available systems 24/7. The Lead Senior Systems Engineer directs all server, virtualization, and data center operations, along with backup and recovery, disaster recovery, and configuration management activities.
Position Responsibility Summary:
- Own the availability and performance of all server, virtualization, and storage infrastructure that approximately 3,500 users and mission-critical applications depend on daily; when systems go down, lead the recovery and take accountability for restoring service
- Architect and evolve the VMware VCF/SDDC environment to meet growing demand; make capacity planning decisions that balance cost, performance, and future scalability without over-provisioning
- Drive the adoption and maturation of containerized workloads on Red Hat OpenShift; establish deployment standards, manage the container lifecycle, and ensure the platform remains stable as development teams push new applications into production
- Design and validate backup and recovery strategies that actually work under pressure; regularly test restores, validate RPO/RTO compliance, and ensure the team can reconstitute the full IT environment from the COOP site if required
- Maintain deep technical mastery across both Windows Server and Red Hat Enterprise Linux; troubleshoot complex cross-platform issues that intermediate administrators cannot resolve and serve as the final escalation point for the team
- Build and maintain infrastructure-as-code practices using Ansible and Chef to ensure consistent, repeatable, and auditable system configurations across hundreds of VMs and servers
- Think like a defender: harden all systems to DISA STIG standards not because a checklist says so, but because you understand the threat landscape and know that a misconfigured server is an open door
- Develop your team; grow intermediate engineers into senior-level performers through mentoring, knowledge sharing, and progressively challenging assignments; create internal documentation and runbooks that reduce single points of knowledge failure
- Coordinate closely with the Network and Cybersecurity leads to ensure infrastructure changes are planned holistically; a server migration impacts network routing, firewall rules, and security monitoring simultaneously
- Plan and execute complex maintenance windows with zero or minimal mission impact; communicate clearly to stakeholders about what is happening, why, and what the fallback plan is if something goes wrong