We are looking for a Senior Data Engineer with exposure to Agentic AI concepts who can bridge modern data platforms with intelligent automation. The ideal candidate will have strong hands-on experience in AWS-based data engineering, working knowledge of emerging agent frameworks such as LangGraph, and exposure to Agentic AI use cases, especially code generation and workflow automation.
Key Responsibilities
Data Engineering:
- Design and build scalable data pipelines using AWS Glue and PySpark
- Implement data lake architectures leveraging AWS Lake Formation
- Develop and optimize data models and transformations in Snowflake
- Ensure data quality, governance, and performance optimization across pipelines
- Work with structured and semi-structured data at scale
- Contribute to Agentic AI use cases, especially:
  - Automated code generation (ETL / SQL / PySpark)
  - Intelligent pipeline orchestration
- Work with agent orchestration frameworks like LangGraph (intermediate level)
- Understand and experiment with AgentCore or similar agent execution environments
- Collaborate with AI/ML teams to integrate LLM-driven workflows into data platforms
- Build proofs of concept (POCs) for AI-assisted data engineering (a minimal sketch follows this list)
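For illustration only, here is a minimal sketch of the kind of agent-orchestrated POC this role involves, built with LangGraph. The node names, state fields, and stubbed logic are assumptions for the example, not the team's actual implementation; a real POC would call an LLM and your data platform from the nodes.

```python
# Minimal sketch (illustrative assumptions throughout): a two-step LangGraph
# workflow that drafts an ETL transformation and then reviews it.
from typing import TypedDict

from langgraph.graph import StateGraph, END


class CodegenState(TypedDict):
    request: str        # plain-language description of the desired transformation
    draft_code: str     # generated PySpark/SQL draft
    review_notes: str   # reviewer feedback


def generate_code(state: CodegenState) -> dict:
    # In a real POC this node would call an LLM (e.g. via Amazon Bedrock);
    # here it returns a placeholder so the graph runs end to end.
    return {"draft_code": f"-- TODO: PySpark/SQL for: {state['request']}"}


def review_code(state: CodegenState) -> dict:
    # A second node could lint, dry-run, or LLM-review the draft.
    return {"review_notes": "placeholder review of draft_code"}


graph = StateGraph(CodegenState)
graph.add_node("generate_code", generate_code)
graph.add_node("review_code", review_code)
graph.set_entry_point("generate_code")
graph.add_edge("generate_code", "review_code")
graph.add_edge("review_code", END)

app = graph.compile()
result = app.invoke(
    {"request": "daily revenue by region", "draft_code": "", "review_notes": ""}
)
print(result["draft_code"])
```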
Required Skills
- Strong experience in:
  - AWS Glue
  - AWS Lake Formation
  - PySpark (hands-on coding)
  - Snowflake (data warehousing & performance tuning)
- Good understanding of:
  - Data lake & lakehouse architectures
  - ETL/ELT design patterns
  - Data governance and security
- Experience or exposure to:
  - Agentic AI use cases (preferably code generation / automation)
- Working knowledge of:
  - LangGraph or similar agent orchestration frameworks
- Basic understanding of:
  - AgentCore or similar frameworks
Good to Have
- Experience with LLM integrations (OpenAI, Bedrock, etc.; see the illustrative Bedrock sketch below)
- Understanding of prompt engineering & workflow chaining
- Exposure to MLOps / LLMOps concepts
- Knowledge of vector databases / RAG pipelines
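For the LLM-integration items above, a minimal sketch of calling a model on Amazon Bedrock to draft a SQL transformation. The model ID, region, prompt, and table name are illustrative assumptions; it presumes a boto3 version with the Converse API and a Bedrock model enabled in your account.

```python
# Minimal sketch of an LLM integration via Amazon Bedrock's Converse API.
# Model ID, region, prompt, and table name are illustrative assumptions.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # any model enabled in your account
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "text": "Draft a Snowflake SQL query that aggregates daily revenue "
                            "by region from a table named SALES.ORDERS."
                }
            ],
        }
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.2},
)

# The Converse API returns the assistant message under output.message.content
draft_sql = response["output"]["message"]["content"][0]["text"]
print(draft_sql)
```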