Research Projects
VecMol
Published | 3D Molecular Generation: Vector-Field Representations
Proposed a novel vector-field representation for molecular generation by modeling continuous molecular fields instead of directly generating atomic coordinates. This approach enables more efficient and physically plausible 3D molecule generation. Accepted at ICML 2026.
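To make the idea of a field-based representation concrete, here is a toy sketch in NumPy. It is not the VecMol formulation: the field definition (a Gaussian-weighted average of displacements toward atoms), the `sigma` bandwidth, and the function names `vector_field` and `recover_atoms` are all assumptions chosen for illustration. It only shows how atom positions can be read back out of a continuous vector field by following it, rather than being stored as coordinates directly.

```python
import numpy as np

def vector_field(points, atoms, sigma=0.2):
    """Toy field (an assumption, not VecMol's definition): at each query
    point, return the softmax-weighted average displacement toward the atoms."""
    diff = atoms[None, :, :] - points[:, None, :]      # (m, n, 3) displacements
    sq_dist = np.sum(diff ** 2, axis=-1)               # (m, n)
    logits = -sq_dist / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return np.einsum("mn,mnd->md", weights, diff)      # (m, 3)

def recover_atoms(starts, atoms, sigma=0.2, steps=50):
    """Follow the field from each start point until it settles near an atom."""
    x = starts.copy()
    for _ in range(steps):
        x = x + vector_field(x, atoms, sigma)
    return x

# Hypothetical three-atom geometry (roughly water-like, in angstroms).
atoms = np.array([[0.00, 0.00, 0.0],
                  [0.96, 0.00, 0.0],
                  [-0.24, 0.93, 0.0]])
rng = np.random.default_rng(0)
starts = atoms + rng.normal(scale=0.05, size=atoms.shape)  # perturbed guesses
recovered = recover_atoms(starts, atoms)
```

In this toy setup the atoms are well separated relative to `sigma`, so each perturbed start point flows back to its nearest atom. A generative model would learn the field itself; here the field is computed from known atoms purely to show the readout step.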
Mass Spectrum ↔ Molecule Retrieval
In Progress | Cross-modal Retrieval for Metabolite Identification
Developed a CLIP-style cross-modal retrieval framework that aligns tandem mass spectra with molecular representations. This enables direct retrieval of candidate molecules from spectral queries, facilitating metabolite identification without requiring exhaustive database search.
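The general CLIP-style recipe behind this kind of framework can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the project's implementation: the embeddings are taken as given (the spectrum and molecule encoders are out of scope), and the function names, the `temperature` value of 0.07, and the top-`k` interface are hypothetical choices.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_loss(spec_emb, mol_emb, temperature=0.07):
    """Symmetric contrastive loss: row i of spec_emb is paired with row i
    of mol_emb; all other rows in the batch act as negatives."""
    logits = l2_normalize(spec_emb) @ l2_normalize(mol_emb).T / temperature
    idx = np.arange(logits.shape[0])
    loss_s2m = -log_softmax(logits, axis=1)[idx, idx].mean()  # spectrum -> molecule
    loss_m2s = -log_softmax(logits, axis=0)[idx, idx].mean()  # molecule -> spectrum
    return 0.5 * (loss_s2m + loss_m2s)

def retrieve(query_spec_emb, candidate_mol_emb, k=3):
    """Return indices of the k most similar candidate molecules per query."""
    sims = l2_normalize(query_spec_emb) @ l2_normalize(candidate_mol_emb).T
    return np.argsort(-sims, axis=1)[:, :k]

# Synthetic stand-ins for encoder outputs: each spectrum embedding is a
# lightly perturbed copy of its paired molecule embedding.
rng = np.random.default_rng(0)
mol_emb = rng.normal(size=(5, 16))
spec_emb = mol_emb + 0.01 * rng.normal(size=(5, 16))

aligned_loss = clip_loss(spec_emb, mol_emb)
mismatched_loss = clip_loss(spec_emb, np.roll(mol_emb, 1, axis=0))
top1 = retrieve(spec_emb, mol_emb, k=1)[:, 0]
```

Once both modalities share an embedding space, retrieval reduces to a similarity ranking over precomputed molecule embeddings, which is what lets a spectral query return candidates directly.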
Knowledge-guided Molecular Retrieval
Ongoing | Incorporating Chemical Priors into Retrieval Models
Exploring how chemical fragmentation mechanisms and biological knowledge can be incorporated into retrieval models for metabolite identification. This project aims to move beyond pure embedding-based retrieval by integrating domain-specific reasoning about how molecules fragment and behave in biological systems. This is becoming the central project of my current research.
Graph Foundation Models
Shanghai AI Lab | LLM-driven Graph Data Generation
Contributed to GraphGen, a framework that enhances supervised fine-tuning for large language models with knowledge-driven synthetic graph data generation. This work explores how structured graph data can be generated and leveraged to improve LLM performance. Submitted to KDD and EMNLP.