Neeraja Chekka

Sr. Data Engineer
Wellington (relocating to Auckland)

Summary

Seasoned Data Engineer with a proven ability to design, implement, and optimize data-driven solutions. Proficient in modern platforms such as Microsoft Azure and Snowflake for building scalable, efficient data pipelines. Strong background in Big Data analytics, with extensive experience in Talend (Big Data, Data Integration), Apache NiFi, and Informatica. Skilled in processing large datasets with the Hadoop and Spark ecosystems, writing Spark SQL scripts, and working with Hive, Sqoop, and HDFS. Track record in data migration projects, including migrating from Greenplum to Hortonworks using Spark SQL and PySpark. Experienced in Agile Scrum methodologies, team-based project delivery, and preparing detailed technical and functional documentation for client engagements. Recognized for delivering innovative solutions, optimizing data workflows, and contributing to production releases and system migrations across various industries.

Overview

9 years of professional experience
3 years of post-secondary education
2 certifications

Work History

Sr. Data Engineer

Spark Ltd
Wellington
01.2023 - Current

Project: Revenue Leakage Identification – Spark Audit Interface (SAI)

Technologies: Azure (ADF, Power Automate, ADO, BLOB, Power BI), Databricks (Python), Snowflake

Overview:
Designed and deployed a Spark-based audit interface to streamline revenue assurance processes and identify potential revenue leakages across multiple business streams.

Key Contributions:

  • Architected a unified Spark-based audit framework, serving as a single source of truth for revenue validation.
  • Developed automated data validation processes to detect anomalies and revenue discrepancies across business streams such as WAN and LFC.
  • Implemented fuzzy matching algorithms for enhanced data accuracy and validation.
  • Built ETL pipelines to seamlessly ingest and transform data from diverse source systems.

Key Achievements:

  • Delivered an end-to-end automation solution for contract extraction and leakage identification.
  • Enabled proactive revenue monitoring and recovery through automated workflows for credit adjustments (overbilling) and revenue recovery (underbilling).
  • Prevented revenue leakages worth over NZD 1M by streamlining auditing processes.

Project: DataGPT – AI-Powered Analytics Assistant

Technologies: Azure (ADF, ADO, BLOB, Power BI), Snowflake

Overview:
Developed DataGPT, an AI-powered analytics assistant designed to simplify and automate data analysis through natural language queries, enabling business users to extract actionable insights seamlessly.

Key Contributions:

  • Integrated diverse data sources and developed API connections for streamlined data access.
  • Designed and implemented a robust response delivery system for efficient query handling.
  • Automated end-to-end data processing pipelines for data extraction and transformation, ensuring real-time insights.

Key Achievements:

  • Simplified complex data workflows, enabling business stakeholders to make data-driven decisions effectively.
  • Enhanced business intelligence through real-time availability of insights, driving improved operational efficiency.
  • Ensured data quality through rigorous testing, validation, and monitoring of all data assets, minimizing inaccuracies and inconsistencies.
  • Reengineered existing ETL workflows to improve performance by identifying bottlenecks and optimizing code accordingly.
  • Championed the adoption of agile methodologies within the team, resulting in faster delivery times and increased collaboration among team members.

Sr Data Engineer

EY
Wellington
08.2022 - 12.2022

Sr. Data Engineer

IBM
09.2020 - 07.2022

Project: Multicloud Management Services (MCMS – R&D)

Technologies: Cloudera Data Platform, Ansible Playbooks, Hadoop, Spark, Python, Ranger, AWS, Acceldata

Overview:
Developed scalable and automated solutions for managing hybrid and multicloud infrastructures across AWS, Azure, GCP, and Alibaba Cloud. Enhanced operational efficiency through advanced monitoring, reporting, and service optimization.

Key Contributions:

  • Built and optimized data pipelines and profiling tools using Spark and Pandas.
  • Automated infrastructure management with Ansible playbooks for consistent, repeatable configuration.
  • Leveraged Cloudera Data Platform and Hadoop for large-scale data processing.
  • Integrated Acceldata for data quality monitoring and performance enhancement.

Key Achievements:

  • Delivered a standardized multicloud management framework supporting SMB and enterprise clients.
  • Improved system reliability and operational efficiency using SRE principles.
  • Enabled businesses to seamlessly transition to and operate in hybrid cloud environments.

Sr. Data Engineer

GE Digital
10.2017 - 01.2020

Project: Talend to NiFi Migration
Duration: Oct 2018 – Dec 2019
Technologies: TIS (Talend Enterprise), TOS (Talend Open Studio), Apache NiFi

Overview:
Migrated Talend-based ETL frameworks to Apache NiFi, optimizing performance and ensuring system stability.

Key Contributions:

  • Analyzed the existing Talend framework to map functionalities for NiFi.
  • Converted Talend logic into NiFi workflows using equivalent native functions.
  • Conducted performance and regression testing to ensure seamless platform migration.
  • Stabilized the platform for enhanced operational reliability.


Project: Greenplum to Hortonworks Migration
Duration: Sep 2017 – Oct 2018
Technologies: Hive, Spark SQL, PySpark, Python, Talend

Overview:
Executed a seamless migration of Greenplum database workflows to Hortonworks, leveraging Spark and Hive for efficient data processing.

Key Contributions:

  • Migrated Greenplum artifacts, including mirror tables and consumption layers, to Hortonworks.
  • Translated Greenplum (PL/pgSQL) logic into Spark and Hive workflows.
  • Developed and scheduled Spark jobs in Talend to streamline data workflows.
  • Optimized code for improved system performance and scalability.


Project: GE Transportation Datalake
Duration: Dec 2016 – Aug 2017
Technologies: Greenplum, Talend

Overview:
Designed and implemented ETL solutions for GE Transportation’s Datalake, ensuring robust and automated data management.

Key Contributions:

  • Built Talend workflows for loading data into Greenplum for visualization teams.
  • Designed an automation framework to dynamically handle schema changes, resolving recurring production issues.
  • Engaged with end-users to gather and fulfill data requirements.
  • Played a key role in data discovery, onboarding critical datasets for business needs.

Associate Consultant

Infosys Ltd
05.2016 - 09.2017

Project: ARR – Asia Regulatory Reporting (APAC)

Role: Talend and Java Developer
Technologies: Talend (ETL), UNIX, Oracle PL/SQL, Java/J2EE, Spring, Adobe Flex

Overview:
Developed a robust reporting system to meet compliance requirements for financial institutions in the APAC region, delivering automated trade and valuation reports for regulatory bodies such as FSC and GTSM.

Key Contributions:

  • Designed and implemented Talend ETL pipelines for automated data processing and reporting.
  • Migrated Talend workflows from Windows to UNIX environments using shell scripts.
  • Created a user interface for critical data maintenance, enhancing reporting accuracy.
  • Led data quality initiatives to ensure accurate and valid reporting.
  • Conducted release planning and deployed DDL and Talend code in UAT and production environments.


Project: Daymon Interactions BI

Role: Talend Developer and Team Lead
Technologies: Talend (ETL), SQL, Redshift, Java, Tableau, AWS

Overview:
Implemented data integration and automation solutions to optimize retail service operations for Daymon Worldwide, enabling real-time insights and improved brand loyalty through data-driven marketing.

Key Contributions:

  • Developed and executed ETL pipelines in Talend, transforming source data into Redshift target schemas.
  • Automated data transfer and processing using AWS services such as S3 and ECS.
  • Enhanced system performance through SQL optimization and process redesign.
  • Integrated Tableau for business intelligence reporting, enabling actionable insights.
  • Conducted rigorous testing to ensure high-quality data delivery and reliability.


Project: Data Ingestion Framework

Role: Talend and Hadoop Developer
Technologies: Talend (Big Data), MySQL, Hadoop, HDFS, Hive

Overview:
Developed a data ingestion framework to extract, transform, and load data from diverse relational systems into Hadoop, enabling scalable data processing for business analytics.

Key Contributions:

  • Built ETL pipelines using Talend to automate data ingestion into HDFS and Hive tables.
  • Designed incremental and full-load processes for efficient data refreshes based on user flags.
  • Prepared low-level design documents and implemented optimized mappings.
  • Conducted code reviews, unit testing, and deployment activities to ensure production readiness.
  • Streamlined release processes, ensuring timely delivery of solutions.

Education

MTech - Data Science

BITS Pilani
Bangalore, India
01.2018 - 01.2021

Bachelor of Engineering - Electronics and Communication Engineering

JNTU (Jawaharlal Nehru Technological University)
Andhra Pradesh, India
09.2010

Skills

Azure Data Platform (ADF, Power Automate, Blob, ADO)

Certification

Talend Administration
AWS

Timeline

Sr. Data Engineer

Spark Ltd
01.2023 - Current

Sr. Data Engineer

EY
08.2022 - 12.2022

Sr. Data Engineer

IBM
09.2020 - 07.2022

AWS

01.2020

Talend Administration

09.2018

MTech - Data Science

BITS Pilani
01.2018 - 01.2021

Sr. Data Engineer

GE Digital
10.2017 - 01.2020

Associate Consultant

Infosys Ltd
05.2016 - 09.2017

Bachelor of Engineering - Electronics and Communication Engineering

JNTU (Jawaharlal Nehru Technological University)