Analysing Unified Embedding with Morphological Insight for Multilingual Text Representation
DOI: https://doi.org/10.5269/bspm.82246
Abstract
Multilingual text representation is increasingly complex, which makes it difficult to design embedding techniques that capture linguistic nuances across languages. In this paper, we propose a unified embedding model that integrates morphological insights for multilingual text representation, focusing on a dataset that spans English, Marathi, and Konkani. We explore several embedding methods (Fused Embeddings, Character-Level Embeddings, Word-Level Embeddings, and Contextualized Multilingual Embeddings) and demonstrate how they can be combined with morphological information to improve language understanding. We illustrate the approach on a real-world dataset and evaluate it on downstream tasks such as text summarization, highlighting the gains from fused embeddings. A comparative analysis of the embedding methods, together with a detailed workflow and mathematical modeling, provides insight into the strengths and weaknesses of each approach.
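To make the idea of combining embeddings concrete, the following is a minimal sketch of one common fusion strategy: each component vector is L2-normalized and the results are concatenated. This is an illustrative assumption, not the fusion method used in the paper, which the abstract does not specify. The function names `l2_normalize` and `fuse_embeddings` and the toy vectors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def fuse_embeddings(word_vec, char_vec, context_vec):
    """Fuse three embeddings of one token by normalizing and concatenating them.

    Assumption: concatenation stands in for whatever fusion the paper uses.
    Normalizing first keeps one component from dominating purely by scale.
    """
    fused = []
    for part in (word_vec, char_vec, context_vec):
        fused.extend(l2_normalize(part))
    return fused


# Toy vectors (hypothetical values) for a single token.
word_vec = [3.0, 4.0]                 # word-level embedding
char_vec = [1.0, 0.0, 0.0]            # character-level embedding
context_vec = [0.0, 2.0, 0.0, 0.0]    # contextualized embedding

fused = fuse_embeddings(word_vec, char_vec, context_vec)
print(len(fused))
```

The fused vector has as many dimensions as the three inputs combined, so a downstream model sees the word-level, character-level, and contextual signals side by side. Morphological features could be appended as a further component in the same way.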
License
Copyright 2026 Boletim da Sociedade Paranaense de Matemática

This work is licensed under a Creative Commons Attribution 4.0 International License.
When the manuscript is accepted for publication, the authors automatically agree to transfer the copyright to the Sociedade Paranaense de Matemática (SPM).
The journal uses the Creative Commons Attribution license (CC BY 4.0).



