Toolshed of Dingyi
Welcome to my toolshed, home to all the magic I have collected along the way. My current projects are in Portfolio; the others are in Old Projects. The contents are updated regularly.
Portfolio
-
From Notion Research Hub to Data Warehouse
database
Architected a live analytical pipeline from Notion to Snowflake, mirroring enterprise DW best practices
-
Difference Between PostgreSQL and MySQL
database
Deeper understanding of PostgreSQL and MySQL as open-source relational database management systems (RDBMS)
-
From Slurm Batch Workflow to Kafka Pipeline Using Python
bigdata
Reproducing a HAICORE/Slurm batch workflow as a real-time Kafka pipeline using Python and `kafka-python`
-
Parallel and Concurrent Computing in Python
scaling
How to increase efficiency in large-scale data processing
-
Scientific Simulation vs Industrial A/B Testing
testing
Different testing methods for laboratory and field trials
-
Research Hub Chatbot Illustration
LLM
I built my own chatbot based on GPT-3.5 and LangChain
-
Probabilistic Causal Effect Estimation for Panel Data
Structural_Causal_Model Synthetic_Control_Method Deconfounding
Master's Thesis
Blog
- 2025-03-08» From Notion Research Hub To Data Warehouse
- 2025-03-02» Difference Between PostgreSQL And MySQL
- 2025-03-01» From Slurm Batch Workflow To Kafka Pipeline Using Python
- 2025-02-20» Spline Regression
- 2025-01-07» Simulation Study Design Vs Industry A/B Test
- 2024-12-17» Parallel And Concurrent Computation In Python
- 2024-03-01» Research Chatbot Built Upon Large Language Model And LangChain
- 2023-08-11» Large Language Models
- 2023-06-20» Data Mesh
- 2023-06-15» Probabilistic Causal Effect Estimation
- 2022-08-16» Distributional Regression
- 2022-08-15» Uplift Modeling
- 2022-08-12» Prediction Of Return
- 2022-08-01» Explainable Scores For Randomized Tree Ensembles
- 2022-02-18» German English Translator
- 2018-02-01» Stocks Assessment TTM
- 2018-01-03» Taichung Gastronomy Map
- 2017-12-29» Investment Psychology
Old Projects
-
Basic Introduction to Spline Regression
Spline
The relationship between the degree of freedom and the number of knots in the spline
-
LLMs learning notes
LLM
Build a pizza ordering chatbot!
-
How to predict risk in return by an online retailer?
BA LightGBM
Assignment of Business Analytics and Data Science WiSe20/21 (HU Berlin)
-
How to predict property prices for Airbnb listings in London?
BA NLP ImageProcessing
Assignment of Advanced Data Analytics for Management Support SoSe22 (HU Berlin)
-
How to optimize advertisement strategy via causal-neural-network-based uplift modeling?
CausalML
Seminar report of Applied Predictive Analytics SoSe21 (HU Berlin)
-
How to build a German-English translator based on PyTorch?
NLP
Course project of Introduction to Natural Language Processing WiSe21/22 (HU Berlin)
-
CFC or SHAP Values? An empirical study for comparison
XAI
Core idea of the conference paper "Approximation of SHAP Values for Randomized Tree Ensembles" published by Springer Nature Switzerland AG 2022
-
How to build a grammar corrector based on PyTorch?
NLP
Course project of Introduction to Natural Language Processing WiSe21/22 (HU Berlin)
-
How to build a text generator based on PyTorch?
NLP
Course project of Introduction to Natural Language Processing WiSe21/22 (HU Berlin)
-
Conditional Income Distribution Across Age and Race in the U.S. in 2016
DistributionalRegression
Term paper of Econometric Projects WiSe21/22 (HU Berlin)
-
Empirical Dynamic Modeling and Convergent Cross-Mapping
EDM CCM
Final project of Applied Time Series Course SoSe18 (CUC)
-
How to assess the stock price of TTM
Long-short-MA BollingerBand StockAnalysis
Final project of Securities Investment Course WiSe17/18 (FCU)
-
How to detect anxiety from Taiwan Index 50
PCA StockAnalysis
Final project of Software for Data Analysis Course WiSe17/18 (FCU)
-
How to find the best accommodation for foodies in Taichung based on Kmeans
KmeansClustering
Final project of R Programming Course WiSe17/18 (FCU)
-
Trickle-down effect in China from the perspective of social comparison
FixedEffectsOLS 2SLS CFPS
Term paper of Econometrics Course SoSe17 (CUC)
-
Influence of parent-child communication on the mental state of adolescents
Clustering PCA
Term paper of Multivariate Statistical Analysis Course SoSe17 (CUC)
-
GLMMs in insurance pricing via SAS
GLMMs
Final project of SAS Course SoSe17 (CUC)