Genta Winata
1,252 posts
Genta Winata
@gentaiscool
AI Researcher @CapitalOne AIF. Ex @TechAtBloomberg @BigScienceW @SFResearch @hkust. Working on multilingual and LLM #NLProc. Building @GrassrootsSci
gentawinata.com
Joined December 2011
855
Following
1,872
Followers
  • Pinned
    Genta Winata
    @gentaiscool
    Apr 25, 2025
    ⭐️We're thrilled to share that our paper WorldCuisines has been selected for the Best Theme Paper Award at NAACL 2025 @naaclmeeting! 🎉 A huge thank you to the reviewers and area chair for this incredible recognition — we’re truly honored. Massive gratitude to all our amazing
    2025.naacl.org
    Announcing the NAACL 2025 Award Winners!
    Heading into the last week before the conference, we’d like to announce our Award Winners. The Best Paper and Best Theme Paper winners will present at our closing session, but all papers (including...
    18K
  • Genta Winata
    @gentaiscool
    Sep 24, 2022
    🚨 Excited to share our new multilingual dataset, NusaX, covering 10 linguistically diverse low-resource Indonesian 🇮🇩 languages on the sentiment analysis & MT tasks. 📰 arxiv.org/pdf/2205.15960… We released the dataset github.com/IndoNLP/nusax #indonlp #nusax #NLProc @aclmeeting
  • Genta Winata
    @gentaiscool
    Apr 3, 2023
    It has been 3⃣ years since we started the first initiative on the Indonesian benchmark, IndoNLU, and built IndoBERT as the foundation of IndoNLP 🇮🇩. We have seen so much progress 🥳 Repo: github.com/IndoNLP/indonlu follow the🧵to explore the journey ⛵️ #indonesian #indonlp @NLProc
    22K
  • Genta Winata
    @gentaiscool
    Oct 17, 2024
    🌎 World Cuisines 🍣🍙🥧🍨🍱🍕🍪🥓🫕 🤔 Is this dumpling 🥟 known as gyoza, jiaozi, or momo? is it a Chinese, Japanese or Korean dish? We are excited to introduce WorldCuisines, a high-quality 1⃣ million massive-scale multilingual parallel VQA benchmark available in 30
    44K
  • Genta Winata
    @gentaiscool
    Jun 17, 2024
    🔥Exciting news! We present SEACrowd, the first Southeast Asian languages benchmark 🇮🇩🇲🇾🇰🇭🇵🇭🇸🇬🇻🇳🇧🇳🇲🇲🇹🇱🇱🇦🇹🇭 via crowdsourcing ⛏️ Paper: arxiv.org/abs/2406.10118 With 61 authors and nearly 1000 languages, SEACrowd has 36 indigenous languages & 13 tasks. #nlproc @aclmeeting 🧵👇
    36K
  • Genta Winata
    @gentaiscool
    May 26, 2023
    Does an LLM forget when it learns a new language? We systematically study catastrophic forgetting in a massively multilingual continual learning framework in 51 languages. Preprint: arxiv.org/abs/2305.16252 ⬇️🧵 The paper was accepted at #acl2023nlp findings #NLProc [1/4]
    27K
  • Genta Winata
    @gentaiscool
    Sep 11, 2020
    We are glad that our paper, IndoNLU, has been accepted at AACL-IJCNLP. It is the first-ever large benchmark for Indonesian with 12 NLU tasks! And one of the largest collaborations I have had so far. We plan to release the preprint, code, data, and website soon! #aacl2020 #aacl #nlproc
  • Genta Winata
    @gentaiscool
    Jan 22, 2023
    🥳Exciting news @eaclmeeting 2023 A year-long work on building extremely low-resource NLP datasets (some from ZERO resource ➡️ a resource). NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages 🇮🇩 📰 Paper: arxiv.org/pdf/2205.15960… #NLProc @aclmeeting
    35K
  • Genta Winata
    @gentaiscool
    May 2, 2023
    🚨 We are excited to have our NusaCrowd paper accepted in #acl2023 Findings. It is the first collaborative initiative to collect Indonesian corpora 🇮🇩 It is BIG momentum for IndoNLP to push datasets to be open & publicly accessible. arxiv.org/abs/2212.09648 #NLProc @aclmeeting
    25K
  • Genta Winata
    @gentaiscool
    Oct 31, 2024
    🎉If you're attending @emnlpmeeting, feel free to reach out! My team at @CapitalOne AI Foundations is hiring for full-time positions and internships (Ph.D.), with a particular focus on LLM pretraining / fine-tuning / alignment. I'm also open to meeting and discussing potential
    22K
  • Genta Winata
    @gentaiscool
    Feb 24, 2022
    Paper accepted by ACL 2022! Do you know that Indonesia has more than 700 spoken languages? And they also have many dialects 😮 This is our first work investigating the NLP challenges on those languages and dialects #acl2022nlp #acl2022 #NLProc @aclmeeting
    openreview.net
    One Country, 700+ Languages: NLP Challenges for Underrepresented...
    NLP research is impeded by a lack of resources and awareness of the challenges presented by under-represented languages and dialects. Focusing on the languages spoken in Indonesia, the second most...
  • Genta Winata
    @gentaiscool
    May 4, 2023
    Boom! 🤯 Our NusaX was awarded outstanding paper 😎 at #eacl2023. We are happy that the NLP community recognizes our work on underrepresented languages 🇮🇩😀 It is a remarkable milestone for the #indonlp community. Thanks to all collaborators! #NLProc @eaclmeeting
    10K
  • Genta Winata
    @gentaiscool
    Jun 25, 2022
    Introducing NusaCrowd 🇮🇩, a crowdsourcing project to collect resources and benchmarks on Indonesian languages. We invite contributors to share their datasets in our repository. ✅ Get co-authorship ✅ Open data Check our GitHub: github.com/IndoNLP/nusa-c… @aclmeeting #NLProc
  • Genta Winata
    @gentaiscool
    Oct 21, 2024
    🤔 What is the most effective metric for summarization? Is it 1⃣ BLEU, 2⃣ ROUGE, or perhaps 3⃣ METEOR? ❔How can we develop a metric that aligns closely with human preferences? ✨We present MetaMetrics, a calibrated meta-metric specifically designed to evaluate generation
    21K
