Toolshed of Dingyi
Welcome to my toolshed, a collection of the tools and tricks I have picked up along the way. My current projects are listed under Portfolio; earlier ones are under Old Projects. The contents are updated regularly.
Portfolio
- From Notion Research Hub to Data Warehouse
  Tags: database
  Architected a live analytical pipeline from Notion to Snowflake, mirroring enterprise data warehouse best practices
- Difference Between PostgreSQL and MySQL
  Tags: database
  A deeper look at PostgreSQL and MySQL as open-source relational database management systems (RDBMS)
- From Slurm Batch Workflow to Kafka Pipeline Using Python
  Tags: bigdata
  Reproducing a HAICORE/Slurm batch workflow as a real-time Kafka pipeline using Python and `kafka-python`
- Parallel and Concurrent Computing in Python
  Tags: scaling
  How to increase efficiency in large-scale data processing
- Scientific Simulation vs Industrial A/B Testing
  Tags: testing
  Different testing methods for laboratory and field trials
- Research Hub Chatbot Illustration
  Tags: LLM
  I built my own chatbot based on GPT-3.5 and LangChain
- Probabilistic Causal Effect Estimation for Panel Data
  Tags: Structural_Causal_Model Synthetic_Control_Method Deconfounding
  Master's Thesis
Blog
- 2025-03-08» From Notion Research Hub To Data Warehouse
- 2025-03-02» Difference Between PostgreSQL and MySQL
- 2025-03-01» From Slurm Batch Workflow To Kafka Pipeline Using Python
- 2025-02-20» Spline Regression
- 2025-01-07» Simulation Study Design Vs Industry A/B Test
- 2024-12-17» Parallel And Concurrent Computation In Python
- 2024-03-01» Research Chatbot Built Upon Large Language Model And Langchain
- 2023-08-11» Large Language Models
- 2023-06-20» Data Mesh
- 2023-06-15» Probabilistic Causal Effect Estimation
- 2022-08-16» Distributional Regression
- 2022-08-15» Uplift Modeling
- 2022-08-12» Prediction Of Return
- 2022-08-01» Explainable Scores For Randomized Tree Ensembles
- 2022-02-18» German English Translator
- 2018-02-01» Stocks Assessment TTM
- 2018-01-03» Taichung Gastronomy Map
- 2017-12-29» Investment Psychology
Old Projects
- Basic Introduction to Spline Regression
  Tags: Spline
  The relationship between the degrees of freedom and the number of knots in a spline
- LLMs learning notes
  Tags: LLM
  Build a pizza-ordering chatbot!
- How to predict risk in return by an online retailer?
  Tags: BA LightGBM
  Assignment of Business Analytics and Data Science WiSe20/21 (HU Berlin)
- How to predict property prices for Airbnb listings in London?
  Tags: BA NLP ImageProcessing
  Assignment of Advanced Data Analytics for Management Support SoSe22 (HU Berlin)
- How to optimize advertisement strategy via causal-neural-network-based uplift modeling?
  Tags: CausalML
  Seminar report of Applied Predictive Analytics SoSe21 (HU Berlin)
- How to build a German-English translator based on PyTorch?
  Tags: NLP
  Course project of Introduction to Natural Language Processing WiSe21/22 (HU Berlin)
- CFC or SHAP Values? An empirical study for comparison
  Tags: XAI
  Core idea of the conference paper "Approximation of SHAP Values for Randomized Tree Ensembles" published by Springer Nature Switzerland AG 2022
- How to build a grammar corrector based on PyTorch?
  Tags: NLP
  Course project of Introduction to Natural Language Processing WiSe21/22 (HU Berlin)
- How to build a text generator based on PyTorch?
  Tags: NLP
  Course project of Introduction to Natural Language Processing WiSe21/22 (HU Berlin)
- Conditional Income Distribution Across Age and Race in the U.S. in 2016
  Tags: DistributionalRegression
  Term paper of Econometric Projects WiSe21/22 (HU Berlin)
- Empirical Dynamic Modeling and Convergent Cross-Mapping
  Tags: EDM CCM
  Final project of Applied Time Series Course SoSe18 (CUC)
- How to assess the stock price of TTM
  Tags: Long-short-MA BollingerBand StockAnalysis
  Final project of Securities Investment Course WiSe17/18 (FCU)
- How to detect anxiety from Taiwan Index 50
  Tags: PCA StockAnalysis
  Final project of Software for Data Analysis Course WiSe17/18 (FCU)
- How to find the best accommodation for foodies in Taichung based on Kmeans
  Tags: KmeansClustering
  Final project of R Programming Course WiSe17/18 (FCU)
- Trickle-down effect in China from the perspective of social comparison
  Tags: FixedEffectsOLS 2SLS CFPS
  Term paper of Econometrics Course SoSe17 (CUC)
- Influence of parent-child communication on the mental state of adolescents
  Tags: Clustering PCA
  Term paper of Multivariate Statistical Analysis Course SoSe17 (CUC)
- GLMMs in insurance pricing via SAS
  Tags: GLMMs
  Final project of SAS Course SoSe17 (CUC)