Minimum qualifications:
- Bachelor's degree or equivalent practical experience.
- 8 years of experience in software development.
- 5 years of experience building and developing infrastructure, distributed systems or networks, or experience with compute technologies, storage, or hardware architecture.
- 5 years of experience testing and launching software products, and 3 years of experience with software design and architecture.
Preferred qualifications:
- Master’s degree or PhD in Engineering, Computer Science, a related technical field, or equivalent practical experience.
- 8 years of experience with data structures and algorithms.
- 3 years of experience working in a complex, matrixed organization, including a technical leadership role leading project teams and setting technical direction.
- 3 years of experience with GenAI techniques (e.g., LLMs, Multi-Modal, Large Vision Models) or with GenAI-related concepts (e.g., language modeling, computer vision).
About the job
Google Cloud’s mission is to make every business successful through AI by combining cutting-edge technology, infrastructure, and talent. AI/ML software engineers in Cloud bridge the gap between pioneering models and a massive product vehicle reaching billions. Our talent density and AI-powered tools drive rapid development, rooted in a culture of empowerment and a bias to action. In this role, you aren’t just building technology; you’re shaping the frontier of enterprise and driving the evolution of advanced models.
The ML, Systems, & Cloud AI (MSCA) organization at Google designs, implements, and manages the hardware, software, machine learning, and systems infrastructure for all Google services (Search, YouTube, etc.) and Google Cloud. Our end users are Googlers, Cloud customers, and the billions of people who use Google services around the world.
We prioritize security, efficiency, and reliability across everything we do, from developing our latest TPUs to running a global network, while working to shape the future of hyperscale computing. Our global impact spans software and hardware, including Google Cloud’s Vertex AI, the leading AI platform for bringing Gemini models to enterprise customers.
Responsibilities
- Own the technical goal and architecture of the Vector Search Serving infrastructure and drive decisions that ensure the system can manage massive datasets and query volumes with low latency.
- Contribute to critical code paths, conduct high-quality code reviews, and author technical design documents to maintain a technical presence.
- Advocate for reliability and service health by overseeing incident management processes, ensuring that customer issues are resolved effectively and that post-mortems lead to systemic improvements.
- Act as the bridge between technology and users by understanding customer use cases and ensuring the infrastructure evolves to meet their changing needs.