Shirley YongLin He

Auckland

Summary

Results-driven Data Engineer with experience at IBM and Huazhong University of Science and Technology, specializing in ETL processes, data analytics, and pipeline optimization. Skilled in using SQL, Python, and BI tools to transform complex datasets into actionable insights. Proven track record of optimizing data workflows, developing interactive dashboards, and supporting better decision-making. Adept at solving complex problems and collaborating with cross-functional teams to deliver impactful data solutions.

Overview

3 years of professional experience

Work History

Data Engineer Intern

Huazhong University of Science and Technology
Wuhan, Remote
11.2024 - 02.2025
  • Optimized SQL queries to enhance search efficiency in a high-volume database with millions of entries, enabling flexible search criteria.
  • Designed and implemented robust ETL/ELT pipelines to streamline data processing, ensuring well-organized and efficient data analysis.
  • Ensured data consistency and integrity by developing a PostgreSQL/PostGIS database aligned with the Darwin Core standard, leveraging MVCC mechanisms.
  • Built habitat suitability models from elevation, slope, and aspect data to support ecological conservation planning, and applied Dijkstra’s algorithm to optimize routes for field research logistics.
  • Partnered with cross-functional teams to define database requirements, manage data warehousing, and develop interactive dashboards for data visualization.
  • Applied strategic problem-solving skills to analyze complex challenges, prioritize tasks, and manage dependencies, ensuring timely project delivery.

Data Engineer Intern

IBM
Wuhan, China
06.2022 - 01.2023
  • Processed large datasets by importing CSV files, extracting key information using regular expressions, and structuring data into a relational SQL database.
  • Designed and implemented data models using entity-relationship (ER) modeling and star schema design, developing an interactive Power BI dashboard for data visualization.
  • Documented methodology, limitations, and recommendations in a structured analytical report, proposing alternative approaches for long-term business insights.
  • Took ownership of the project by collaborating with multiple stakeholders to drive the end-to-end delivery of the data analysis initiative.
  • Applied strong problem-solving skills under mentorship, proactively seeking guidance, identifying growth opportunities, and establishing weekly one-on-one meetings to track progress.

Education

Master of Information Technology

University of Melbourne
Australia
12.2024

Bachelor of Data Science

University of Melbourne
Australia
12.2022

Skills

  • SQL/Python/R
  • MySQL/PostgreSQL/MongoDB
  • ER Modeling/Star Schema/Regression Models/Bayesian Inference/Random Forest/Transformer-based Models
  • Power BI/Tableau/Matplotlib
  • AWS/Azure
  • Prioritization & Problem-Solving
  • Teamwork & Communication
  • Stakeholder Management

Project Experience

Machine-Generated vs. Human-Written Text Classification (02/2024 - 06/2024)

  • Conducted EDA on 180K+ imbalanced text samples, identifying key distribution patterns.
  • Applied TF-IDF vectorization and hybrid resampling to improve model robustness.
  • Designed and optimized ensemble classifiers, boosting predictive accuracy.
  • Deployed a high-precision model for detecting machine-generated text, validated on benchmarking platforms.

Rugby Match Data Processing & Analysis (02/2022 - 06/2022)

  • Built an NLP-based pipeline to extract and analyze match reports for performance insights.
  • Identified team trends using Python, Matplotlib, and Seaborn.
  • Investigated media bias by analyzing quote frequency vs. team performance.
  • Enhanced text processing using TF-IDF, Word2Vec, CountVectorizer, and POS-based filtering.

Hospitality Revenue Analysis Dashboard (02/2025 - 03/2025)

  • Developed an interactive Power BI dashboard to analyze occupancy rates, ADR, and RevPAR.
  • Applied advanced visuals and trend analysis to enhance data-driven decision-making.
  • Improved stakeholder insights through dynamic filters and responsive design.
  • Enabled optimized revenue management across multiple hotel properties.

Sales & Profit Analysis Dashboard (12/2024 - 01/2025)

  • Created a Power BI dashboard to track sales, revenue, and profit margin trends.
  • Leveraged interactive charts to highlight market performance and growth opportunities.
  • Provided actionable insights for sales and marketing teams, enhancing profitability and efficiency.

References

References available upon request.
