Odia Generative AI (OdiaGenAI for short) is an initiative to research Generative AI and Large Language Models (LLMs) for the low-resource Odia language.
The OdiaGenAI aims to
The data, code, and models will be available to the public for research and non-commercial purposes.
First: Although many LLMs are multilingual and support Odia, their performance on various tasks (e.g., content generation, question answering) is limited by the small amount of Odia data they have ingested.
Second: High-performing LLMs often come with subscriptions or usage fees.
Third: How the data input to these LLMs is used (privacy), and how biased it is, remain open questions.
We have divided the primary focus areas into three parts.
1. Literature Survey: Investigate the latest developments in Generative AI and LLMs and analyze current methods to support the Odia language for different tasks.
2. Development: Develop pre-trained and fine-tuned Odia LLMs, which includes dataset preparation, model training, evaluation, prompt engineering, and API development.
3. Deployment: Deploy the Odia LLMs for public access for research and non-commercial purposes.
The models (pre-trained/fine-tuned) will be available through Hugging Face for research and non-commercial purposes. Feel free to contact us for a domain-specific application or particular use cases.
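As a rough sketch of how the released models might be discovered programmatically, the snippet below lists the public models under the `OdiaGenAI` organization on the Hugging Face Hub. It assumes the third-party `huggingface_hub` package is installed and that network access is available. The helper names `model_url` and `list_org_models` are illustrative and not part of the project.

```python
from typing import List

# Organization name taken from https://huggingface.co/OdiaGenAI
ORG = "OdiaGenAI"


def model_url(model_id: str) -> str:
    """Return the Hub page URL for a model id of the form '<org>/<model-name>'."""
    return f"https://huggingface.co/{model_id}"


def list_org_models(org: str = ORG) -> List[str]:
    """Return the sorted ids of all public models published under `org`.

    Assumption: `huggingface_hub` is installed; the call needs network access.
    """
    # Imported lazily so this module can still be loaded offline.
    from huggingface_hub import list_models

    return sorted(info.id for info in list_models(author=org))


# Example (requires network access):
#     for model_id in list_org_models():
#         print(model_id, model_url(model_id))
```

The listing is queried at call time rather than hard-coded, so it reflects whatever models are currently published.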
OdiaGenAI LLMs have several use cases. We are focusing on applying the developed LLMs to three primary domains relating to Odisha:
About our logo: The Olive Ridley sea turtle, listed as Vulnerable on the IUCN Red List, is one of the world's smallest marine turtles and the most abundant. It travels thousands of kilometers across the ocean to nest. The Gahirmatha Marine Sanctuary in Odisha is the largest known mass nesting rookery for Olive Ridley sea turtles worldwide.
If you find this repository useful, please consider giving 👏 and citing:
@misc{OdiaGenAI,
author = {Shantipriya Parida and Sambit Sekhar and Soumendra Kumar Sahoo and Swateek Jena and Abhijeet Parida and Satya Ranjan Dash and Guneet Singh Kohli},
title = {OdiaGenAI: Generative AI and LLM Initiative for the Odia Language},
year = {2023},
publisher = {Hugging Face},
journal = {Hugging Face repository},
howpublished = {\url{https://huggingface.co/OdiaGenAI}},
}
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.