Advancements in Natural Language Processing for Scientific Research

CERN openlab Technical Workshop 2019

24/01/2019

Taghi Aliyev

Natural Language Processing Tools

• A lot of duplicated work in every scientific field
• Non-reproducible research
• Many different data structures and conventions, hence the need for parsers…
• High barriers to entering research fields
• Lack of common ground and all-in-one environments
• Sparked by discussions with members of the medical community
• Genomics analysis experts, professors in bioinformatics, personal experiences

Taghi Aliyev, IBM Meeting

Introduction to the Platform

• Large-scale collaborative research platform
• Main focus on ease-of-use, reproducibility of research
• Use of Machine Learning for Narrative interfaces
• Information Retrieval
• Natural Language Processing (Chatbots)
• Provide and host in-house solutions and projects

Chatbots and Information Retrieval

• Lower the barriers for junior researchers
• Enhance the way research is done for everyone
• Chatbots as Personal Assistants
• Information Retrieval and Question Answering:

Models and Frameworks

• Models being tested:
  • QANet
  • DSSM (Deep Semantic Similarity Models)
  • Recently released: BERT (Bidirectional Encoder Representations from Transformers)
• Framework to host the models:
  • RASA

Model – QANet: combining local convolution with global self-attention
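
The idea named in the slide title can be sketched as a toy encoder block: a convolution mixes each position with its neighbours (local), then self-attention lets every position look at every other one (global). This is only a minimal illustration in plain Python, not the QANet implementation: it assumes identity query/key/value projections and a fixed smoothing kernel, and it leaves out layer normalization, position encoding, multiple heads and the feed-forward layer. All function names are hypothetical.

```python
import math

def depthwise_conv1d(seq, kernel):
    """Local mixing: each channel is convolved over neighbouring positions.

    seq is a list of positions, each a list of channel values; kernel is a
    list of odd length shared by all channels (an assumption for brevity).
    Zero padding keeps the sequence length unchanged.
    """
    half = len(kernel) // 2
    n, d = len(seq), len(seq[0])
    out = []
    for t in range(n):
        row = []
        for c in range(d):
            acc = 0.0
            for k, w in enumerate(kernel):
                pos = t + k - half
                if 0 <= pos < n:
                    acc += w * seq[pos][c]
            row.append(acc)
        out.append(row)
    return out

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """Global mixing: every position attends to every other position.

    Queries, keys and values are the inputs themselves (identity
    projections); a trained model would use learned matrices instead.
    """
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, seq)) for c in range(d)])
    return out

def add(a, b):
    """Element-wise sum of two sequences, used for residual connections."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def encoder_block(seq, kernel=(0.25, 0.5, 0.25)):
    """Convolution sub-layer, then self-attention sub-layer, each with a residual."""
    seq = add(seq, depthwise_conv1d(seq, list(kernel)))
    seq = add(seq, self_attention(seq))
    return seq

# Four token positions with two channels each (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
encoded = encoder_block(tokens)
```

The output keeps the input shape (one vector per token), which is what allows such blocks to be stacked.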

Model – DSSM; Deep Semantic Similarity Model
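
A DSSM-style ranker maps a query and each candidate document to vectors, scores them by cosine similarity, and turns the scores into a softmax posterior. The sketch below is a simplified illustration of that scoring pipeline only: it builds letter-trigram count vectors and compares them directly, whereas a trained DSSM would first pass them through learned layers. The `gamma` value, the function names and the example documents are assumptions made for illustration.

```python
import math
from collections import Counter

def letter_trigrams(text):
    """Word hashing: break each word into letter trigrams with boundary marks."""
    grams = Counter()
    for word in text.lower().split():
        marked = f"#{word}#"
        for i in range(len(marked) - 2):
            grams[marked[i:i + 3]] += 1
    return grams

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_documents(query, documents, gamma=10.0):
    """Score documents against a query and return them sorted by posterior.

    gamma is a smoothing factor (hypothetical value). The learned
    projection layers of a real DSSM are skipped here: the trigram
    counts are compared directly.
    """
    q = letter_trigrams(query)
    sims = [cosine(q, letter_trigrams(doc)) for doc in documents]
    exps = [math.exp(gamma * s) for s in sims]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(documents, sims, probs), key=lambda item: item[2], reverse=True)

# Made-up candidate documents for illustration.
docs = [
    "heritability studies on twins",
    "open source chatbot frameworks",
    "question answering with self-attention",
]
ranking = rank_documents("twin heritability", docs)
```

Because matching happens on letter trigrams rather than whole words, "twin" still overlaps with "twins", which is one reason this kind of hashing is attractive for retrieval over noisy scientific text.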

Hosting Tool – RASA: Open Source tools for contextual AI Assistants

• Python-based tool
• Allows for custom actions
• Easing the integration of pre-trained models
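
As an illustration of how a custom action could wrap a pre-trained model, here is a minimal sketch that assumes the current `rasa_sdk` package and its `Action` base class. The action name, the `answer_question` helper and the tiny lookup table are hypothetical stand-ins; they are not part of the platform described in the slides.

```python
try:
    from rasa_sdk import Action
except ImportError:
    class Action:  # fallback stub so the sketch can be run without Rasa installed
        def name(self):
            raise NotImplementedError

# Hypothetical in-memory lookup standing in for a pre-trained model.
FAQ = {
    "qanet": "QANet combines local convolution with global self-attention.",
    "dssm": "DSSM stands for Deep Semantic Similarity Model.",
}

def answer_question(question):
    """Hypothetical stand-in for a call into a pre-trained QA or retrieval model."""
    for keyword, answer in FAQ.items():
        if keyword in question.lower():
            return answer
    return "Sorry, I do not have an answer for that yet."

class ActionAnswerQuestion(Action):
    """Custom action that forwards the user's last message to the model."""

    def name(self):
        # Must match the action name listed in the assistant's domain file.
        return "action_answer_question"

    def run(self, dispatcher, tracker, domain):
        question = tracker.latest_message.get("text") or ""
        dispatcher.utter_message(text=answer_question(question))
        return []  # no events (such as slot updates) are emitted
```

Swapping `answer_question` for a call into QANet, DSSM or BERT would leave the action class itself unchanged, which is the kind of integration the slide refers to.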

Holding the models accountable and explainability

• Understanding the reasoning and decision-making is crucial
• Not very straightforward for deep neural networks
• More relevant for a conversational bot
• Holding the model responsible when it leads to accidents
• Ability to trace back the effects and the outcome
• Initial test case:
  • TwinsUK with KCL for feature extraction in heritability studies
  • Pre-trained CNN

Deconvolutional Neural Networks

Some initial results

Perturbation on input image and correlation

• Results of initial tests on 2 twins
• With 2 different ways to compute correlations

Results on 2 twins
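
The slides do not say which perturbation was applied, what exactly was correlated, or which two correlation measures were used. The sketch below therefore rests on assumptions: it occludes square patches of an image, recomputes a feature vector, and compares it with the unperturbed one using Pearson and Spearman correlation. The `toy_features` function is a hypothetical stand-in for the pre-trained CNN, and the image is synthetic.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0
    return cov / (sx * sy)

def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def occlude(image, top, left, size, fill=0.0):
    """Return a copy of the image with a size x size patch set to fill."""
    out = [row[:] for row in image]
    for r in range(top, min(top + size, len(image))):
        for c in range(left, min(left + size, len(image[0]))):
            out[r][c] = fill
    return out

def toy_features(image):
    """Hypothetical feature extractor: row means followed by column means."""
    rows = [sum(row) / len(row) for row in image]
    cols = [sum(col) / len(col) for col in zip(*image)]
    return rows + cols

def sensitivity_map(image, size=2):
    """Per patch position: 1 - correlation between original and perturbed features."""
    base = toy_features(image)
    result = {}
    for top in range(0, len(image), size):
        for left in range(0, len(image[0]), size):
            feats = toy_features(occlude(image, top, left, size))
            result[(top, left)] = (1 - pearson(base, feats), 1 - spearman(base, feats))
    return result

# Synthetic 4 x 4 image with increasing pixel values.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
scores = sensitivity_map(image)
```

Patches whose occlusion lowers the correlation most are the ones the features depend on most, which is the kind of back-tracing the accountability slide asks for.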

Where do we stand now?

• Last touches for the convolutional neural network
• Next: Generalization to different network architectures
• Especially for the textual cases
• Not an investigated problem
• Even more true in Medical Informatics

Application Areas and Use Cases

• Public/Social
  • GENIAL, Geneva Responsive City Camp
• Research
  • SQuAD 2.0 Challenge
  • Vignette extraction and analysis
• Education
  • Training tools/Personal Assistant
• Still looking for partners and use cases

Conclusion

• Deconvolution:
  • An interesting idea that can be incorporated into the platform to provide insights
• Conversational bots:
  • BERT proposes a generic and interesting approach
  • DSSM and QANet are proven to be of decent quality
  • Improvements are still required
• Use Cases:
  • GENIAL case being presented this coming Monday at AMLD
  • Has the interest of the Canton of Geneva and a dedicated testing group

Explore how Natural Language Processing tools can lower entry barriers in scientific research and make research more efficient. Learn about models such as QANet and DSSM, and about the use of Machine Learning in narrative interfaces and chatbots for information retrieval.

Presentation Transcript

Smart Platforms for Science. CERN openlab Technical Workshop 2019. Taghi Aliyev, 24/01/2019.

Background and Motivation: Natural Language Processing Tools. A lot of duplicated work in every scientific field; non-reproducible research; many different data structures and conventions, hence the need for parsers; high barriers to entering research fields; lack of common ground and all-in-one environments. Sparked by discussions with members of the medical community; genomics analysis experts, professors in bioinformatics, personal experiences.

Introduction to the Platform. Large-scale collaborative research platform; main focus on ease-of-use and reproducibility of research; use of Machine Learning for narrative interfaces; Information Retrieval; Natural Language Processing (chatbots); provide and host in-house solutions and projects.

Natural Language Processing: Chatbots and Information Retrieval. Lower the barriers for junior researchers; enhance the way research is done for everyone; chatbots as personal assistants; Information Retrieval and Question Answering.

Natural Language Processing: Models and Frameworks. Models being tested: QANet; DSSM (Deep Semantic Similarity Models); recently released: BERT (Bidirectional Encoder Representations from Transformers). Framework to host the models: RASA.

Natural Language Processing: Model QANet, combining local convolution with global self-attention.

Natural Language Processing: Model DSSM, Deep Semantic Similarity Model.

Natural Language Processing: Hosting Tool RASA, Open Source tools for contextual AI Assistants. Python-based tool; allows for custom actions; eases the integration of pre-trained models.

Natural Language Processing: Holding the models accountable and explainability. Understanding the reasoning and decision-making is crucial; not very straightforward for deep neural networks; more relevant for a conversational bot; holding the model responsible when it leads to accidents; ability to trace back the effects and the outcome. Initial test case: TwinsUK with KCL for feature extraction in heritability studies; pre-trained CNN.

Natural Language Processing: Deconvolutional Neural Networks.

Deconvolution: Some initial results.

Deconvolution: Perturbation on input image and correlation. Results of initial tests on 2 twins, with 2 different ways to compute correlations. Results on 2 twins.

Deconvolution: Where do we stand now? Last touches for the convolutional neural network. Next: generalization to different network architectures; especially for the textual cases; not an investigated problem; even more true in Medical Informatics.

Natural Language Processing: Application Areas and Use Cases. Public/Social: GENIAL, Geneva Responsive City Camp. Research: SQuAD 2.0 Challenge; vignette extraction and analysis. Education: training tools/personal assistant. Still looking for partners and use cases.

Conclusion. Deconvolution: an interesting idea that can be incorporated into the platform to provide insights. Conversational bots: BERT proposes a generic and interesting approach; DSSM and QANet are proven to be of decent quality; improvements are still required. Use cases: the GENIAL case is being presented this coming Monday at AMLD; it has the interest of the Canton of Geneva and a dedicated testing group.
