CONVAI2 Competition: Improving Chit-Chat Dialogue Models


The ConvAI2 Competition aims to improve chit-chat dialogue models by addressing issues such as inconsistent personality, lack of long-term memory, and generic responses. With a focus on the PersonaChat dataset, participants are encouraged to submit models for evaluation, with a chance to win $20,000. The submission deadline is Sep 30th; winners will be announced at NeurIPS 2018. Join the challenge to contribute to the advancement of conversational AI research.





Presentation Transcript


  1. CONVAI2 COMPETITION: THE TASK
  Organizers: Mikhail Burtsev, Varvara Logacheva, Valentin Malykh, Ryan Lowe, Iulian Serban, Shrimai Prabhumoye, Emily Dinan, Douwe Kiela, Alexander Miller, Kurt Shuster, Arthur Szlam, Jack Urbanek and Jason Weston.
  Speaker: Jason Weston, Facebook AI Research
  Advisory board: Yoshua Bengio, Alan W. Black, Joelle Pineau, Alexander Rudnicky, Jason Williams.

  2. COMPETITION AIM: A SHARED TASK TO FIND BETTER CHIT-CHAT DIALOGUE MODELS
  Few datasets are available for non-goal-oriented dialogue (chit-chat), and there is currently no standard evaluation procedure.
  ConvAI2 goals:
  - A concrete scenario for testing chatbots
  - A standard evaluation tool: the datasets are open source, and the baselines and evaluation code (automatic + MTurk) are open source
  - Encourage all competitors to open-source their code; the winner must do so

  3. CONVAI2: CONVERSATIONAL AI CHALLENGE
  This is the second ConvAI Challenge. Last year focused on dialogue based on a news/Wikipedia article. This year we aim to improve over last year by:
  - Providing a dataset from the beginning, PersonaChat
  - Making the conversations more engaging for humans
  - A clearer evaluation process (automatic evaluation, followed by human evaluation)

  4. PERSONACHAT DATASET (Zhang et al., 2018) http://parl.ai/ The dataset consists of 164,356 utterances in 11k dialogues, covering 1155 personas.
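To make the dataset's structure concrete, the sketch below shows how a PersonaChat example conditions a model on persona facts plus dialogue history. This is illustrative only: the persona lines and turns are example data, and the exact on-disk format is ParlAI-internal.

```python
# Illustrative sketch of PersonaChat conditioning: a model's input is
# typically the persona facts concatenated with the dialogue history.
persona = [
    "i like to ski",                        # example persona lines
    "my wife does not like me anymore",
]
history = [
    ("partner", "hi, how are you doing today?"),
    ("self", "i am great. i just got back from the slopes."),
]

# Build a flat model input from persona + recent turns.
model_input = " ".join(f"your persona: {p}" for p in persona)
model_input += " " + " ".join(text for _, text in history)
```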

  5. COMPETITION AIM: SHARED TASK TO FIND BETTER CHIT-CHAT DIALOGUE MODELS
  Common issues with chit-chat models include:
  (i) no consistent personality (Li et al., 2016);
  (ii) no long-term memory: seq2seq models take only the last lines of dialogue as input (Vinyals et al., 2015);
  (iii) a tendency toward generic responses, e.g. "I don't know" (Li et al., 2015).
  ConvAI2 aims to find models that address these issues.

  6. PRIZE The winning entry will receive $20,000 in Mechanical Turk funding in order to encourage further data collection for dialogue research.

  7. SCHEDULE
  - Competitors submit models until Sep 30th.
  - Models are evaluated on a hidden test set via automated metrics (PPL, hits@1, F1); the leaderboard is visible to all competitors.
  - 'Wild' live evaluation on Messenger/Telegram can be used to tune models.
  - Sep 30th: systems locked.
  - The 7 best-performing systems make it to the next round and are evaluated using Mechanical Turk and the 'wild' evaluation.
  - Winners announced at NeurIPS 2018.
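The automated metrics can be sketched as follows. These are simplified word-level versions for illustration, not the official evaluation code: the whitespace tokenization and the `hits_at_1` signature are assumptions.

```python
from collections import Counter

def f1_score(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted reply and the gold reply
    (a simplified version of the word-level F1 used for dialogue)."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both sequences.
    num_same = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

def hits_at_1(candidate_scores: list, gold_index: int) -> float:
    """hits@1: 1.0 if the model ranked the gold candidate highest."""
    best = max(range(len(candidate_scores)), key=candidate_scores.__getitem__)
    return 1.0 if best == gold_index else 0.0
```

Perplexity (PPL) is omitted here since it depends on the model's token-level likelihoods rather than on the reply text alone.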

  8. RULES
  - Competitors must indicate which training data sources are used.
  - Training may be augmented with other data, as long as that data is publicly released (and hence reproducible).
  - Competitors must provide their source code so that hidden-test-set evaluation can be computed, and so that the competition has further impact in the future.
  - Code can be in any language, but a thin Python wrapper must be provided in order to work with our evaluation code via ParlAI's interface.
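A submission wrapper follows ParlAI's agent interface: the framework calls `observe()` with each incoming message and then `act()` for the reply. The sketch below imitates that interface with a local stand-in class rather than importing ParlAI, so `observe`/`act` match the real API shape, but the class name and the echo behavior are purely illustrative.

```python
class MyModelAgent:
    """Minimal wrapper in the shape of a ParlAI agent. ParlAI calls
    observe() with the latest message dict, then act() for a reply.
    The model behind it (here, a fixed placeholder reply) would be
    replaced by the competitor's actual system."""

    def __init__(self):
        self.history = []  # accumulated dialogue context

    def observe(self, observation: dict) -> dict:
        # Observations are dicts with a 'text' field, as in ParlAI messages.
        self.history.append(observation.get('text', ''))
        return observation

    def act(self) -> dict:
        # A real submission would run its model over self.history here.
        reply = "i don't know"  # placeholder generic response
        return {'id': 'MyModelAgent', 'text': reply}
```

Keeping the wrapper thin means the underlying model can be in any language; only this observe/act shim needs to be Python.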
