Data Scientist Intern - Udayana
Jakarta
FinTech - Data Products & Infra
Internship
Remote
About the Role
We are looking for a dynamic, energetic intern to join our team and contribute to the development of our in-house Large Language Model (LLM). In this role, you'll tackle the challenges of data collection, validation, and labeling, directly impacting the quality and effectiveness of our LLM. You'll work closely with our data science and engineering teams, participate in daily standups, and collaborate on live production projects. Over the first six months, you'll gain hands-on experience in managing and preparing large datasets, ensuring the quality of our AI models, and contributing to cutting-edge AI projects that drive real business value.
What You Will Do
- Collect and validate data from various sources to train our LLM.
- Analyze, review, and accurately label text data based on predefined guidelines.
- Ensure the consistency and quality of labeled data by following established protocols.
- Provide feedback on labeling guidelines and suggest improvements.
- Work closely with data science and engineering teams to meet project goals.
- Participate in daily standups and collaborate on projects.
- Translate existing benchmarks from Indonesian to Balinese/Bataknese.
- Given a Balinese/Bataknese passage, generate relevant questions and answers.
- Manually check and evaluate the model’s responses for accuracy and quality.
What You Will Need
- Fluency in Balinese/Bataknese.
- Availability for a minimum of 5 months.
- Proficiency in Python programming and basic data processing skills.
- Attention to detail, with a focus on maintaining high data quality standards.
- Strong communication skills for effective collaboration and documentation.
- Ability to work independently and as part of a team.
- Willingness to learn and adapt in a fast-paced environment.
- Ability to provide constructive feedback and contribute to team discussions.
- This is a paid internship, and a laptop is provided.
About the Team
The GoTo Data Science team is at the forefront of building critical machine learning components that ensure GoTo remains a safe, trusted, and enjoyable platform for digital payments. Our diverse team combines skills in mathematics, statistics, machine learning, and deep learning to solve some of the most challenging business problems in GoPay. We are passionate about both the technical aspects of data science and the tangible business impact of our models. Our collaborative environment encourages sharing knowledge, discussing new ideas, and learning from one another through various internal forums and presentations.