    • 03) Bitlis Eren Üniversitesi Fen Bilimleri Dergisi
    • Volume 14, Issue 1 (2025)

    Improving Text-to-SQL Conversion for Low-Resource Languages Using Large Language Models

    Date
    2025-03-26
    Author
    Öztürk, Emir
    Abstract
    Accurate text-to-SQL conversion remains a challenge, particularly for low-resource languages such as Turkish. This study explores the effectiveness of large language models (LLMs) in translating Turkish natural language queries into SQL and introduces a two-stage fine-tuning approach to enhance performance. Three widely used LLMs (Llama2, Llama3, and Phi3) are fine-tuned under two training strategies: direct SQL fine-tuning, and sequential fine-tuning, in which models are first trained on Turkish instruction data and then fine-tuned on SQL. The resulting six model configurations are evaluated using execution accuracy and logical form accuracy. The results indicate that Phi3 models outperform both the Llama-based models and previously reported methods, achieving execution accuracy of up to 99.95% and logical form accuracy of 99.95%, exceeding the best scores in the literature by 5–10%. The study highlights the effectiveness of instruction-based fine-tuning in improving SQL query generation. It provides a detailed comparison of Llama-based and Phi-based models on text-to-SQL tasks, introduces a structured fine-tuning methodology designed for low-resource languages, and presents empirical evidence of the positive impact of strategic data augmentation on model performance. These findings contribute to the advancement of natural language interfaces for databases, particularly in languages with limited NLP resources. The scripts and models used during the training and testing phases of the study are publicly available at https://github.com/emirozturk/TT2SQL.
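
The abstract names two evaluation metrics without defining them. The sketch below shows how they are commonly computed in text-to-SQL work: execution accuracy compares the rows returned by the predicted and reference queries, while logical form accuracy compares the query strings themselves. This is an illustration under assumptions, not the study's evaluation script: the `ogrenci` table, the sample queries, and the helper names `normalize`, `run`, and `evaluate` are all hypothetical, and the string normalization is deliberately simplistic.

```python
import sqlite3


def normalize(sql: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, and lowercase.

    Simplification: lowercasing also alters string literals, which a real
    evaluator would need to preserve.
    """
    return " ".join(sql.strip().rstrip(";").split()).lower()


def run(conn, sql):
    """Return the query's rows in a canonical order, or None if it fails."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None


def evaluate(conn, pairs):
    """pairs holds (predicted_sql, gold_sql) tuples.

    Returns (execution_accuracy, logical_form_accuracy) as fractions.
    """
    ex_hits = lf_hits = 0
    for predicted, gold in pairs:
        # Logical form accuracy: the normalized query text must match.
        if normalize(predicted) == normalize(gold):
            lf_hits += 1
        # Execution accuracy: both queries must return the same rows.
        gold_rows = run(conn, gold)
        if gold_rows is not None and run(conn, predicted) == gold_rows:
            ex_hits += 1
    n = len(pairs)
    return ex_hits / n, lf_hits / n


# Hypothetical toy database and predictions, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ogrenci (ad TEXT, yas INTEGER)")
conn.executemany(
    "INSERT INTO ogrenci VALUES (?, ?)",
    [("Ali", 20), ("Ayşe", 22), ("Can", 19)],
)

pairs = [
    # Same text up to the semicolon: counts for both metrics.
    ("SELECT ad FROM ogrenci WHERE yas > 19", "SELECT ad FROM ogrenci WHERE yas > 19;"),
    # Different text, same rows on this data: execution accuracy only.
    ("SELECT ad FROM ogrenci WHERE yas >= 20", "SELECT ad FROM ogrenci WHERE yas > 19"),
    # Different text and different rows: counts for neither metric.
    ("SELECT ad FROM ogrenci WHERE yas > 21", "SELECT ad FROM ogrenci WHERE yas > 19"),
]

execution_acc, logical_form_acc = evaluate(conn, pairs)
print(f"execution accuracy:    {execution_acc:.2%}")
print(f"logical form accuracy: {logical_form_acc:.2%}")
```

The second pair shows why the two metrics can diverge: a query may differ textually from the reference and still return the correct rows, so execution accuracy is generally at least as high as logical form accuracy on the same predictions.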
    URI
    http://dspace.beu.edu.tr:8080/xmlui/handle/123456789/15967
    Collections
    • Volume 14, Issue 1 (2025) [37]





    Creative Commons License
    DSpace@BEU by Bitlis Eren University Institutional Repository is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 Unported License.

    DSpace software copyright © 2002-2016  DuraSpace
