About the Team

The Global System Service team owns the infrastructure services and management solutions that power ByteDance's data centers outside of China, from day-to-day operations to long-term architecture design and maintenance. The team specializes in composing end-to-end solutions from open-source community tools and in-house products, tailored to the business requirements and operational complexities of large-scale infrastructure across ByteDance's non-China regions. Our mission is to deliver efficient infrastructure solutions and a stable, secure system environment for ByteDance's global business.

Responsibilities

We are looking for a self-motivated system engineer with an SRE mindset and DevOps skills. Your responsibilities will include:
- Manage and maintain large-scale host infrastructure across ByteDance's non-China data centers, covering OS lifecycle management, configuration standardization, and fleet-wide health monitoring.
- Own the reliability and availability of core data center foundational services, including DNS, NTP, DHCP, NAT, APT repository, and Kerberos authentication.
- Design and implement deployment architectures for foundational services, ensuring high availability, fault tolerance, and disaster recovery across regions.
- Develop and enforce SLOs for managed services; lead incident response, root cause analysis, and post-mortem reviews to drive continuous reliability improvements.
- Collaborate with network, security, and application teams to ensure foundational services meet the evolving demands of global business growth.
- Identify automation opportunities across host management and service operations; drive tooling and process improvements to reduce toil and increase operational efficiency.