Xi Chen

Vienna, Austria

Summary

Data Scientist with 5 years of experience applying AI and machine learning techniques, with a particular emphasis on Natural Language Processing and the implementation of microservices.

Overview

6 years of professional experience

Work History

PwC Austria
Vienna, Austria
10.2022 - 10.2024

Conducted data modeling, statistical analysis, and machine learning to extract insights and support data-driven decisions.

Presented projects to clients, effectively communicating complex technical concepts.

Built AI-powered chatbots using large language models (LLMs), Retrieval-Augmented Generation (RAG), and Knowledge Graphs.
Leveraged Azure cloud services for deployment and scaling.

Automated Invoice Categorization

  • Developed an algorithm for automated invoice classification and data sorting based on specified criteria.
  • Implemented intelligent label suggestions using content analysis.

Master Data Chatbot

  • Cleaned and indexed PDF-based data using Azure for efficient retrieval.
  • Created a chatbot using Streamlit for seamless interaction with master data.

Knowledge Graph-Based Chatbot

  • Extracted data from PDFs and created a Graph Database using Neo4j, LlamaIndex, LangChain, and LLMs.
  • Built a chatbot with Streamlit to interact with the knowledge graph for intelligent query handling.

Asset Discovery

  • Designed and implemented several Object-Oriented Programming (OOP) modules aimed at automated asset discovery for web services.
  • Developed a Python script to authenticate and send HTTP requests to targeted websites and pull the relevant asset data.
  • Created a separate data-cleaning module to filter and transform the raw data into a structured CSV format, enabling easier analysis and reporting.
  • Utilized Docker containers for an isolated testing environment, ensuring the robustness and reliability of the asset discovery modules.

Data Protection Guru

  • Engineered a comprehensive tool that utilizes machine learning techniques for data extraction and manipulation. Implemented data extraction algorithms to parse XML files derived from PDF documents, efficiently collating necessary information.
  • Designed and populated an embedding database to store the extracted data, optimizing for query performance and data integrity.
  • Integrated the Llama 2 large language model to enable interactive chat with stored PDF documents, running locally on a CPU.

Data Scientist

PwC Austria
Vienna, Austria
10.2019 - 10.2022

Used Python to deliver tailored solutions across a wide range of projects and requirements, demonstrating versatility in addressing varied challenges.

Taught introductory Natural Language Processing (NLP) classes at the PwC Data Science Academy, effectively communicating complex topics to non-technical audiences.

Identification of suspicious bank behavior and efficient handling of complaints and requests

Implementation of automated data labelling

  • Data modelling using Natural Language Processing
  • Implementation of various Machine Learning algorithms for forecasting
  • Visualization and presentation of results to communicate findings

Bias Eliminator

Inception and development of an application to detect bias in text, using evidence-based research methods. In the first stage, the app supports the detection and mitigation of gender bias in job advertisements.

  • Literature research, designing the product functionality
  • Utilising state-of-the-art machine learning and natural language processing methods to detect and neutralise biases
  • Implementation of microservices for orchestration between functionalities

Automated Document Redaction

  • Implementation of AI models for document anonymization and knowledge extraction
  • Developed solutions for document type conversion, integrating them across various stages of the code
  • Successfully deployed sophisticated anonymization functions for metafiles.
  • Orchestrated microservices to optimize document processing
  • Ensured data security by debugging and resolving security alerts.

VR Personality Assessment Tool for HR

  • Data extraction from VR log files for comprehensive insights
  • Statistical validation and proofs of concept
  • Automated the evaluation process for efficient reporting

Anomaly Detection

  • Implemented Supervised Machine Learning algorithms
  • Developed an automated highlighting function in Python to detect and highlight anomalies across Excel files, enhancing data analysis efficiency.

Internship

PwC Austria
Vienna, Austria
07.2019 - 09.2019
  • Gained essential Python and machine learning skills
  • Implemented Supervised Machine Learning algorithms
  • Developed an automated highlighting function in Python to detect and highlight anomalies across Excel files, enhancing data analysis efficiency

Internship

DAGOPT Optimization Technologies
Vienna, Austria
03.2019 - 06.2019
  • Conducted data processing tasks to clean and preprocess raw datasets for time series analysis with R
  • Utilized various statistical and machine learning techniques to predict future discounts based on historical data

Education

BSc. - Mathematics

University of Vienna
07.2020

Skills

  • Python
  • Data Science
  • Machine Learning
  • Natural Language Processing
  • LLM
  • Knowledge Graph
  • Azure
  • Databricks
  • Git
  • Docker
  • FastAPI
  • Time-series analysis

Personal Information

  • Date of Birth: 09/26/95
  • Nationality: Austrian

Languages

  • Chinese, Native
  • German, Business fluent
  • English, Business fluent
  • Swedish, Intermediate

Hobbies and Interests

  • Muay Thai
  • Skiing

Techstack

Python, ChromaDB, FastAPI, Flask, Visual Studio .NET, Microsoft Office, Veracode

Professional Development

2019, Agile Scrum Master

Affiliations

  • Professional Muay Thai fighter competing in Thailand

Languages

  • English, Professional
  • German, Professional
  • Chinese (Mandarin), Native/Bilingual
  • Swedish, Limited
  • Thai, Elementary

Timeline

PwC Austria
10.2022 - 10.2024

Data Scientist

PwC Austria
10.2019 - 10.2022

Internship

PwC Austria
07.2019 - 09.2019

Internship

DAGOPT Optimization Technologies
03.2019 - 06.2019

BSc. - Mathematics

University of Vienna