Seasoned Data Engineer with a proven ability to design, implement, and optimize data-driven solutions. Builds scalable, efficient data pipelines on modern platforms such as Microsoft Azure and Snowflake. Strong background in Big Data analytics, with extensive experience in Talend (Big Data, Data Integration), Apache NiFi, and Informatica. Skilled in processing large datasets with the Hadoop and Spark ecosystem, including writing Spark SQL scripts and working with Hive, Sqoop, and HDFS. Track record in data migration projects, including a Greenplum-to-Hortonworks migration using Spark SQL and PySpark. Experienced in Agile Scrum methodologies, team-based project implementation, and preparing detailed technical and functional documentation for client engagements. Strong problem solver who adapts quickly to emerging technologies and is committed to continuous learning. Recognized for delivering innovative solutions, optimizing data workflows, and contributing to production releases and system migrations across various industries.
Project: Revenue Leakage Identification – Spark Audit Interface (SAI)
Technologies: Azure (ADF, Power Automate, ADO, Blob, Power BI), Databricks (Python), Snowflake
Overview:
Designed and deployed a Spark-based audit interface to streamline revenue assurance processes and identify potential revenue leakages across multiple business streams.
Key Contributions:
Key Achievements:
Project: DataGPT – AI-Powered Analytics Assistant
Technologies: Azure (ADF, ADO, Blob, Power BI), Snowflake
Overview:
Developed DataGPT, an AI-powered analytics assistant designed to simplify and automate data analysis through natural language queries, enabling business users to extract actionable insights seamlessly.
Key Contributions:
Key Achievements:
Project: Multicloud Management Services (MCMS – R&D)
Technologies: Cloudera Data Platform, Ansible Playbook, Hadoop, Spark, Python, Ranger, AWS, Acceldata
Overview:
Developed scalable and automated solutions for managing hybrid and multicloud infrastructures across AWS, Azure, GCP, and Alibaba Cloud. Enhanced operational efficiency through advanced monitoring, reporting, and service optimization.
Key Contributions:
Key Achievements:
Project: Talend to NiFi Migration
Duration: Oct 2018 – Dec 2019
Technologies: TIS (Talend Enterprise), TOS (Talend Open Studio), Apache NiFi
Overview:
Migrated Talend-based ETL frameworks to Apache NiFi, optimizing performance and ensuring system stability.
Key Contributions:
Project: Greenplum to Hortonworks Migration
Duration: Sep 2017 – Oct 2018
Technologies: Hive, Spark SQL, PySpark, Python, Talend
Overview:
Executed a seamless migration of Greenplum database workflows to Hortonworks, leveraging Spark and Hive for efficient data processing.
Key Contributions:
Project: GE Transportation Datalake
Duration: Dec 2016 – Aug 2017
Technologies: Greenplum, Talend
Overview:
Designed and implemented ETL solutions for GE Transportation’s Datalake, ensuring robust and automated data management.
Key Contributions:
Project: ARR – Asia Regulatory Reporting (APAC)
Role: Talend and Java Developer
Technologies: Talend (ETL), UNIX, Oracle PL/SQL, Java/J2EE, Spring, Adobe Flex
Overview:
Developed a robust reporting system to meet compliance requirements for financial institutions in the APAC region, delivering automated trade and valuation reports for regulatory bodies such as FSC and GTSM.
Key Contributions:
Project: Daymon Interactions BI
Role: Talend Developer and Team Lead
Technologies: Talend (ETL), SQL, Redshift, Java, Tableau, AWS
Overview:
Implemented data integration and automation solutions to optimize retail service operations for Daymon Worldwide, enabling real-time insights and improved brand loyalty through data-driven marketing.
Key Contributions:
Project: Data Ingestion Framework
Role: Talend and Hadoop Developer
Technologies: Talend (Big Data), MySQL, Hadoop, HDFS, Hive
Overview:
Developed a data ingestion framework to extract, transform, and load data from diverse relational systems into Hadoop, enabling scalable data processing for business analytics.
Key Contributions:
Azure Data Platform (ADF, Power Automate, Blob, ADO)
Talend Administration
AWS