Design of Dimensionality Reduction Algorithm for High-Dimensional Large-Scale Translation Corpora and Lightweight Translation Model Training

Machine translation research currently faces two practical bottlenecks: the "curse of dimensionality" brought on by the exponential growth of translation corpora, and the difficulty of deploying large-scale models, with their high inference latency, in resource-constrained scenarios. To address both, this paper designs a dimensionality reduction algorithm that integrates feature selection with deep reconstruction, and constructs a lightweight translation model training framework by combining knowledge distillation