I’m currently a Ph.D. student in the Department of Civil and Environmental Engineering (Scientific Computation Concentration) at The Hong Kong University of Science and Technology, under the supervision of Prof. Sen Li. I have a very wide-ranging interest in topics related to AI. Currently, my research primarily focuses on the field of Ethical Smart City (Transportation), combined with Multi-Agent Deep Reinforcement Learning.
Previously, I received my B.Eng. degree from the School of Computer Science and Engineering at Sun Yat-sen University in 2024, mentored by Prof. Kai Huang (Embodied AI), Prof. Ning Liu and Prof. Chengying Gao (Music AI). Additionally, I had the privilege of visiting the Shenzhen Research Institute of Big Data associated with The Chinese University of Hong Kong (Shenzhen) under the guidance of Prof. Guangxu Zhu for one year, where I also learned from Prof. Xiaoyang Li and Dr. Hang Li (Wireless Sensing & Network Optimization).
If you are interested in collaborating with me, please feel free to reach out via email. Beyond my primary research interests, I am also keen on NLP, few-shot learning, transfer learning, pre-training, cross domain tasks, time series tasks, embodied AI, HCI, and more.
Aside from academics, I’m a music enthusiast, having been a singer, guitarist, and bassist in the bands Tokamak Disruption (托卡马克崩坏) (alternative metal), NEWS (alternative rock, currently my personal band) and Rights of Lethe (melodic death metal, dissolved). I have a strong passion for heavy rock music, particularly alternative punk, melodic death metal, dark metal, and deathcore. If you are interested in playing or collaborating with us, please don’t hesitate to contact me via email. For information about our shows and ticket purchases, you can visit XiuDong (秀动) and TongGan (同感).
Ph.D. in Civil Engineering (Scientific Computation), 2024~
The Hong Kong University of Science and Technology (Clearwater Bay Campus, Hong Kong)
B.Eng. in Computer Science and Technology (National Basic Subject Talent Training Plan), 2020~2024
Sun Yat-sen University (Guangzhou Campus)
C/C++ (CCF-CSP:320, Top 0.8%), Python, Matlab
Guitar, Bass, Keyboard, Ukulele, and Simple Drum
CCF Student Member (granted for free)
TPC Membership: IEEE PIMRC 2024, IEEE WCNC 2024-2025
Technical Reviewer: ICLR 2025, ACL 2025, IEEE ICME 2024-2025, IEEE ICASSP 2024-2025, IEEE IJCNN 2025, IEEE WCNC 2024-2025, IEEE PIMRC 2024, IEEE SMC 2023, IEEE MTAP
In recent years, Wi-Fi sensing has garnered significant attention due to its numerous benefits, such as privacy protection, low cost, and penetration ability. Extensive research has been conducted in this field, focusing on areas such as gesture recognition, people identification, and fall detection. However, many data-driven methods encounter challenges related to domain shift, where the model fails to perform well in environments different from those of the training data. One major factor contributing to this issue is the limited availability of Wi-Fi sensing datasets, which causes models to learn excessive irrelevant information and over-fit to the training set. Unfortunately, collecting large-scale Wi-Fi sensing datasets across diverse scenarios is a challenging task. To address this problem, we propose CrossFi, a siamese network-based approach that excels in both in-domain and cross-domain scenarios, including few-shot and zero-shot settings, and even works in the few-shot new-class scenario, where the test set contains new categories. The core component of CrossFi is a sample-similarity calculation network called CSi-Net, which improves the structure of the siamese network by using an attention mechanism to capture similarity information, instead of simply calculating the distance or cosine similarity. Building on CSi-Net, we develop an additional Weight-Net that generates a template for each class, so that CrossFi can work in different scenarios. Experimental results demonstrate that CrossFi achieves state-of-the-art performance across various scenarios. In the gesture recognition task, CrossFi achieves an accuracy of 98.17% in the in-domain scenario, 91.72% in the one-shot cross-domain scenario, 64.81% in the zero-shot cross-domain scenario, and 84.75% in the one-shot new-class scenario. The code for our model is publicly available at https://github.com/RS2002/CrossFi.
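For readers unfamiliar with template matching, here is a minimal sketch of the plain cosine-similarity baseline that the abstract contrasts CSi-Net against. It is not the CrossFi implementation. The helper names (`class_templates`, `classify`), the toy embeddings, and the use of a simple mean as the class template (a stand-in for whatever Weight-Net actually produces) are all assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def class_templates(support):
    """Average the support embeddings of each class into one template.

    Assumption: a plain mean stands in for a learned template generator.
    `support` maps a class label to a list of embedding vectors.
    """
    templates = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        templates[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return templates

def classify(query, templates):
    """Return the label whose template is most similar to the query."""
    return max(templates, key=lambda label: cosine_similarity(query, templates[label]))

# Toy 2-D embeddings (hypothetical data, for illustration only).
support = {"wave": [[1.0, 0.0], [0.9, 0.1]], "push": [[0.0, 1.0]]}
templates = class_templates(support)
prediction = classify([0.8, 0.2], templates)
```

In this baseline the similarity function is fixed; the abstract's point is that CSi-Net replaces it with a learned, attention-based similarity.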
Channel State Information (CSI) is the cornerstone in both wireless communication and sensing systems. In wireless communication systems, CSI provides essential insights into channel conditions, enabling system optimizations like channel compensation and dynamic resource allocation. However, the high computational complexity of CSI estimation algorithms necessitates the development of fast deep learning methods for CSI prediction. In wireless sensing systems, CSI can be leveraged to infer environmental changes, facilitating various functions, including gesture recognition and people identification. Deep learning methods have demonstrated significant advantages over model-based approaches in these fine-grained CSI classification tasks, particularly when classes vary across different scenarios. However, a major challenge in training deep learning networks for wireless systems is the limited availability of data, further complicated by the diverse formats of many public datasets, which hinder integration. Additionally, collecting CSI data can be resource-intensive, requiring considerable time and manpower. To address these challenges, we propose CSI-BERT2 for CSI prediction and classification tasks, effectively utilizing limited data through a pre-training and fine-tuning approach. Building on CSI-BERT1, we enhance the model architecture by introducing an Adaptive Re-Weighting Layer (ARL) and a Multi-Layer Perceptron (MLP) to better capture sub-carrier and timestamp information, effectively addressing the permutation-invariance problem. Furthermore, we propose a Mask Prediction Model (MPM) fine-tuning method to improve the model’s adaptability for CSI prediction tasks. Experimental results demonstrate that CSI-BERT2 achieves state-of-the-art performance across all tasks with relatively fast computation speeds. The dataset and code are publicly available at https://github.com/RS2002/CSI-BERT2.
Wireless sensing has recently found widespread applications in diverse environments, including homes, offices, and public spaces. By analyzing patterns in channel state information (CSI), it is possible to infer human actions for tasks such as person identification, gesture recognition, and fall detection. However, CSI is highly sensitive to environmental changes, where even minor alterations can significantly distort the CSI patterns. This sensitivity often leads to performance degradation or outright failure when applying wireless sensing models trained in one environment to another. To address this challenge, Domain Alignment (DAL) has been widely adopted for cross-domain classification tasks, as it focuses on aligning the global distributions of the source and target domains in feature space. Despite its popularity, DAL often neglects inter-category relationships, which can lead to misalignment between categories across domains, even when global alignment is achieved. To overcome these limitations, we propose K-Nearest Neighbors Maximum Mean Discrepancy (KNN-MMD), a novel few-shot method for cross-domain wireless sensing. Our approach begins by constructing a "help set" using K-Nearest Neighbors (KNN) from the target domain, enabling local alignment between the source and target domains within each category using Maximum Mean Discrepancy (MMD). Additionally, we address a key instability issue commonly observed in cross-domain methods, where model performance fluctuates sharply between epochs. Further, most existing methods struggle to determine an optimal stopping point during training due to the absence of labeled data from the target domain. Our method resolves this by excluding the support set from the target domain during training and employing it as a validation set to determine the stopping criterion.
We evaluate the effectiveness of the proposed method across several cross-domain Wi-Fi sensing tasks, including gesture recognition, person identification, fall detection, and action recognition, using both a public dataset and a self-collected dataset. In a one-shot scenario, our method achieves accuracy rates of 93.26%, 81.84%, 77.62%, and 75.30% for the respective tasks. The dataset and code are publicly available at https://github.com/RS2002/KNN-MMD.
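To make the per-category alignment idea concrete, the following is a minimal sketch of a class-wise MMD term, assuming an RBF kernel and the biased estimator of squared MMD. It is not the released KNN-MMD code. The function names, the `gamma` parameter, and the toy features are hypothetical, and the help set is taken as already grouped by class, because the abstract does not specify how KNN selects it.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)

def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors."""
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))

def classwise_mmd(source_by_class, help_by_class, gamma=1.0):
    """Average squared MMD over the classes present in both domains.

    Assumption: `help_by_class` holds target-domain features already
    assigned to classes (e.g. by a KNN step that is not shown here).
    """
    shared = [c for c in source_by_class if c in help_by_class]
    return sum(
        mmd_squared(source_by_class[c], help_by_class[c], gamma) for c in shared
    ) / len(shared)

# Toy features (hypothetical): the target domain is a shifted copy of the source.
source = {"fall": [[0.0, 0.0], [0.1, 0.0]], "walk": [[1.0, 1.0], [1.1, 0.9]]}
shifted = {c: [[x + 0.5, y + 0.5] for x, y in v] for c, v in source.items()}
loss_same = classwise_mmd(source, source)
loss_shifted = classwise_mmd(source, shifted)
```

Computing the discrepancy per class, rather than over the pooled features, is what distinguishes local alignment from the global alignment that the abstract attributes to DAL.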
Small robots encounter considerable difficulties in learning effective motions on complex terrains owing to their underactuated nature and nonlinear dynamics. In this paper, we present a novel framework for robot motion generation that implements reinforcement learning, based on simplified exploration of the robot’s action and time slice conduction. Our framework controls the robot’s actions using normalized signals and hierarchical mappings on mathematical space, which facilitates the learning process. We execute actions within time slices to interact efficiently with the environment. We evaluate the efficacy of our approach on a varied set of simulated terrain scenarios, which include various obstacles and terrain undulations. Our results show that our approach achieves efficient motions on complex terrains designed for small-sized robots.