Key Responsibilities
Architecture & Design
- Architect end-to-end data integration solutions using Talend Data Integration, Talend Cloud, Talend Big Data, and related components
- Design ETL/ELT frameworks, data ingestion pipelines, workflow orchestration, and reusable Talend components
- Define architecture blueprints, HLD/LLD documentation, and integration patterns
- Lead modernization efforts including migration from legacy ETL tools to Talend
- Justify and recommend system sizing — hardware, memory, compute, cluster configurations — based on data volumes, throughput requirements, and SLA expectations
Development & Technical Leadership
- Build, review, and optimize complex Talend jobs hands-on across batch, real-time, and streaming workloads
- Build and optimize ETL/ELT pipelines integrating diverse data sources — databases, APIs, flat files, cloud platforms, and streaming systems
- Design and implement data integration via REST/SOAP APIs, HTTP connectors, Data Services, and routes within Talend
- Implement complex transformations, data quality rules, profiling, cleansing, deduplication, metadata management, and lineage
- Leverage Talend Big Data components to process large-scale datasets on Hadoop, Spark, or cloud-native big data platforms
- Work across relational databases (Oracle, SQL Server, MySQL, PostgreSQL) and cloud storage solutions
- Provide hands-on guidance to development teams on Talend jobs, best practices, error handling, logging, and scalability
- Conduct architecture reviews, performance tuning, and optimization of Talend workloads
Streaming & Real-Time Integration
- Design and implement real-time and near-real-time data pipelines using Talend with Kafka, Spark Streaming, or equivalent streaming frameworks
- Architect event-driven integration patterns for high-throughput, low-latency data flows
- Monitor and tune streaming pipelines for performance, fault tolerance, and reliability
Data Quality & Governance
- Define and enforce data quality frameworks — profiling, validation rules, anomaly detection, and exception handling — within Talend pipelines
- Ensure alignment with enterprise data governance, security, compliance, and data lineage requirements
Operations & DevOps
- Oversee deployment, scheduling, monitoring, and maintenance of Talend jobs
- Collaborate with DevOps teams to design CI/CD pipelines for Talend solutions
- Troubleshoot production issues and ensure high availability and reliability of ETL workflows
Requirements
Required Skills & Experience
- 10+ years of overall data integration/ETL experience, with at least 4–5 years as a Talend architect or senior developer
- Strong hands-on development expertise in Talend Data Integration, Talend Cloud, Talend Big Data, and Talend Administration Center — must be able to build, not just design
- Proven experience designing HLD/LLD and integration architecture patterns
- Demonstrated ability to size and justify infrastructure and platform configurations based on workload profiling and capacity planning
- Hands-on experience with streaming technologies — Kafka, Spark Streaming, or equivalent — integrated within Talend pipelines
- Experience designing and consuming REST/SOAP APIs, HTTP-based connectors, and Talend Data Services for service-oriented integration patterns
- Strong expertise in data quality implementation — profiling, cleansing, validation, deduplication, and exception management within Talend
- Proficiency with Talend Big Data components and processing at scale on Hadoop, Spark, or cloud-native equivalents
- Experience with legacy ETL migration projects (migrations from Informatica, DataStage, or SSIS to Talend preferred)
- Proficiency with relational databases (Oracle, SQL Server, MySQL, PostgreSQL) and cloud platforms (AWS, Azure, or GCP)
- Knowledge of CI/CD practices, Git-based version control, and DevOps tooling (Jenkins, GitLab CI, etc.)
- Understanding of data governance frameworks, metadata management, and lineage concepts
Good to Have
- Talend certification (Architect or Developer level)
- Exposure to cloud-native data platforms (Databricks, Snowflake, Redshift, BigQuery)
- Familiarity with enterprise data cataloging and governance platforms (Collibra, Alation, etc.)
- Experience with containerized deployments (Docker, Kubernetes) for Talend workloads