Automation Engineer (DevOps), Change Manager, Major Incident Coordinator
06/03/2024 - Present
I joined VIB as a DevOps engineer within the Data Core division, responsible for designing, building and maintaining critical infrastructure to support high-performance scientific research. My role combines platform engineering and site reliability responsibilities, including designing a new secure compute infrastructure for Secure Data Processing, and helping the organisation align with the NIS2 Directive through both system design and governance process development.
The Context
VIB is a world-leading life sciences research institute based in Flanders, Belgium. The Data Core team provides the compute, storage and platform infrastructure that underpins research across multiple VIB centres. I relocated to Belgium in 2024, and while the move was a personal decision, the role has given me the opportunity to apply my process design experience in an entirely new domain – European biotechnology research infrastructure.
Although my title is Automation Engineer, the scope of my role extends well beyond engineering. I chair the Change Advisory Board, coordinate Major Incidents, coach team members, and have authored governance processes from scratch – responsibilities that draw directly on my team leadership experience at SiXworks and UKCloud.
Governance & Process Design
Since starting with Data Core in 2024, I have designed and implemented the following policies, procedures and working practices in line with NIS2 requirements:
- Change Management – authored the process, procedure and policy; chair the Change Advisory Board
- Major Incident Management – authored the process, procedure and policy; serve as Major Incident Coordinator during service outages, leading impact assessments and resolution efforts
- Solution Development – establishing a structured approach to solution design and delivery
Achievements
- Authored and implemented the Change Management and Major Incident Management processes, procedures and policies
- Designed and deployed the Data Core SIEM stack, integrating Wazuh, OpenSearch, Graylog and MongoDB for enhanced monitoring and compliance
- Implemented a highly available HashiCorp Vault solution, configured to:
  - Act as an intermediate certificate authority with automated certificate issuance and renewal using Certbot
  - Provide S3 storage encryption in conjunction with MinIO
- Built a highly available Docker Swarm cluster with dedicated environments for development and customer testing
- Developed service endpoint monitoring and alerting to support active incident response and feed into the Major Incident process
- Developed a sensitive data storage and compute platform for handling Human Identifiable Data securely
- Automated virtual infrastructure deployment using Infrastructure as Code principles, reducing VM deployment to two commands:
  - <90 seconds for provisioning any number of virtual machines
  - <5 minutes for fully configured, updated and ready-to-use VMs
Responsibilities
- Manage, maintain and improve existing infrastructure using expertise as an Automation Engineer and Site Reliability Engineer
- Identify and automate manual processes to introduce input validation, error checking, logging and reporting
- Provision virtual infrastructure for internal and adjacent teams to support their daily functions
- Enhance and maintain the Configuration Management Database (NetBox) as the authoritative source of truth
- Act as a Subject Matter Expert, providing technical authority for code reviews and risk analysis to prioritise user experience, uptime and reliability
- Serve as Change Manager and Chair of the Change Advisory Board, championing and updating processes aligned with ITIL best practices
- Function as Major Incident Coordinator during service outages, leading impact assessments and resolution efforts
- Leverage ISO 27001 and ITIL expertise to implement processes and procedures that meet emerging NIS2 requirements
- Triage daily service issues and coach team members in Linux, automation and IT best practices
- Manage the onboarding of researchers to VIB Data Core by provisioning object storage and account services
- Develop and maintain the VIB Data Core storage infrastructure
Technology
- Automation: Ansible, Terraform
- Containerisation: Docker, Docker Swarm
- Virtualisation: VMware
- Infrastructure: Linux (RHEL), HashiCorp Vault, MinIO, NetBox
- SIEM: Wazuh, OpenSearch/Elasticsearch, Graylog
- Databases: MongoDB, PostgreSQL
- Monitoring: Uptime Kuma
- Source Control: Gitea
- Scripting: Bash, Python
- Practices: ITIL, NIS2, Infrastructure as Code, Agile
