Semantic normalization on graph models for finding structural matches in code

This study proposes an approach to detecting structural matches in program code that addresses a fundamental limitation of existing methods: their sensitivity to syntactic changes that preserve logical equivalence. A hybrid architecture has been developed that integrates semantic normalization by large language models (LLMs) with multiple graph representations of code (abstract syntax tree, AST; control-flow graph, CFG; data-flow graph, DFG) and embedding techniques from Graph Neural Networks (GNNs) and Transformer models.
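
To make the limitation concrete, the following minimal sketch illustrates how syntactic variation affects structural comparison. It is not the proposed architecture: it substitutes a purely rule-based identifier renaming over Python's standard `ast` module for the LLM-based normalization, and the names `CanonicalNames`, `structural_key`, and the three sample snippets are hypothetical, introduced only for illustration.

```python
import ast


class CanonicalNames(ast.NodeTransformer):
    """Rename identifiers to placeholders (v0, v1, ...) in order of first use.

    Assumption: a rule-based stand-in for semantic normalization, not the
    LLM-based stage described in the text. Builtins are renamed as well.
    """

    def __init__(self):
        self.names = {}

    def _canon(self, name):
        # The same original name always maps to the same placeholder.
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_FunctionDef(self, node):
        node.name = self._canon(node.name)
        self.generic_visit(node)  # continue into arguments and body
        return node

    def visit_arg(self, node):
        node.arg = self._canon(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._canon(node.id)
        return node


def structural_key(source):
    """Return a string form of the AST after identifier canonicalization."""
    tree = CanonicalNames().visit(ast.parse(source))
    return ast.dump(tree)  # line and column attributes are omitted by default


# Two snippets that differ only in identifier names.
SNIPPET_A = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
SNIPPET_B = "def acc(items):\n    r = 0\n    for item in items:\n        r += item\n    return r\n"
# Same logic as SNIPPET_A, written with an index-based while loop.
SNIPPET_C = (
    "def total(xs):\n    s = 0\n    i = 0\n"
    "    while i < len(xs):\n        s += xs[i]\n        i += 1\n    return s\n"
)

same_after_renaming = structural_key(SNIPPET_A) == structural_key(SNIPPET_B)
same_after_restructuring = structural_key(SNIPPET_A) == structural_key(SNIPPET_C)
```

Under these assumptions, `same_after_renaming` is true while `same_after_restructuring` is false: renaming alone reconciles the first two snippets, but the logically equivalent `while` variant still yields a different tree. Closing that second kind of gap is the role assigned here to semantic normalization combined with the CFG and DFG views.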