Requirements for Semantic Biobanks and Global Biobank Data Retrieval
Explore the critical aspects of semantic interoperability in biobanking, highlighting the need for formal ontologies, comprehensive annotations, and model of meaning data. The (Generalized) Biomedical Retrieval Scenario underscores the importance of effective resource retrieval based on content-based semantic annotations and access regulations. Analogies with global bibliographic databases offer insights into managing heterogeneous resources and ensuring local access conditions. Gain a deeper understanding of semantic representation in biobanks through global biobank data retrieval concepts.
- Biobanks
- Semantic interoperability
- Global biobank data retrieval
- Formal ontologies
- Content-based annotations
Presentation Transcript
Requirements for Semantic Biobanks
André Q ANDRADE (a,b), Markus KREUZTHALER (b), Janna HASTINGS (d,e), Maria KRESTYANINOVA (f,g), Stefan SCHULZ (b,c)
(a) School of Information Science, Federal University of Minas Gerais, Brazil
(b) Medical University of Graz, Austria
(c) University Medical Center Freiburg, Germany
(d) European Bioinformatics Institute, Hinxton, UK
(e) University of Geneva, Switzerland
(f) Helsinki University, Finland
(g) Uniquer, Lausanne, Switzerland
Semantic Biobanks
- Semantic interoperability: systems exchange data + meaning
- Blood and tissue sampling for research
- Samples from several biobanks are needed to retrieve data for a specific research question
- Formal ontologies provide unambiguous descriptions of what is universally true for all objects of a certain type
- An increasing number of biomedical vocabularies are ontology-based (OBO Foundry, SNOMED CT)
- Comprehensive annotations with lab data and clinical data
- Model of Meaning Data
(Generalized) Biomedical Retrieval Scenario
- Retrieval: distribution of heterogeneous resources of interest
- Most retrieval scenarios are recall-oriented
- Resources are used by multiple researchers around the world for multiple purposes
- Effective retrieval depends on querying resource metadata:
  - Provenance information
  - Content-based semantic annotations (structured vocabulary)
  - Access regulations
- Does this sound familiar?
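The three kinds of metadata named on this slide can be sketched as a small catalogue record. This is only an illustration under stated assumptions: the class `ResourceMetadata`, the function `retrieve`, the annotation codes, and the access strings are all hypothetical and do not come from the presentation. The match rule (any shared annotation counts as a hit) is one simple way to favor recall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceMetadata:
    """Hypothetical metadata record for one resource in a global catalogue."""
    resource_id: str
    provenance: str              # e.g. the institution holding the resource
    annotations: frozenset       # codes from a structured vocabulary
    access: str                  # local access regulation, free text here


def retrieve(catalogue, wanted_codes):
    """Recall-oriented match: keep every resource sharing at least one code."""
    wanted = set(wanted_codes)
    return [r for r in catalogue if r.annotations & wanted]


# Invented example data, for illustration only.
catalogue = [
    ResourceMetadata("r1", "Biobank A", frozenset({"gastric_mucosa"}), "approval required"),
    ResourceMetadata("r2", "Biobank B", frozenset({"blood"}), "open"),
    ResourceMetadata("r3", "Biobank C", frozenset({"gastric_mucosa", "blood"}), "consortium only"),
]

hits = retrieve(catalogue, ["gastric_mucosa"])
for r in hits:
    # The query runs on metadata only; access to the resource itself
    # remains subject to the local regulation reported here.
    print(r.resource_id, r.provenance, r.access)
```

The query never touches the resource itself: it returns where a matching resource is held and under which conditions it can be requested.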
Analogy
Global bibliographic database
- Resources: publications from different publishers
- Annotations:
  - Bibliographic data
  - Abstract
  - Semantic representation (MeSH) of paper content
- Local access conditions to the full resource apply
Analogy: Biobank Broker
Global bibliographic database
- Resources: publications from different publishers
- Annotations:
  - Bibliographic data
  - Abstract
  - Semantic representation (MeSH) of paper content
- Local access conditions to the full resource apply
Global biobank sample database
- Resources: biological specimens (blood, tissue, ...)
- Annotations:
  - Sample information (staining etc.)
  - Semantic representation of both lab and selected patient-related information (information models / ontologies)
- Local access conditions to the full resource apply
Data resources for biobanking
- Sample-related information:
  - Type of sample
  - Preparation of sample
  - Time
  - Storage information
  - Physical location
  - Associated information, lab data, genotype, ...
- Donor-related information:
  - Demographic data
  - Phenotype data
  - Time-indexed clinical data (EHR extracts)
[Timeline, 1960 to 2010: increment of relevant donor-related information after samples are taken]
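A minimal sketch of how the sample-related and donor-related information listed above might be held together, assuming Python dataclasses. All class names, field names, and values are hypothetical. The point it illustrates is the one on the timeline: clinical data keep accumulating on the donor record after the sample was taken, so the link from sample to donor has to stay queryable over time.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ClinicalEvent:
    """One time-indexed entry from an EHR extract (hypothetical shape)."""
    recorded_on: date
    code: str  # placeholder for a terminology code


@dataclass
class Donor:
    donor_id: str
    birth_year: int  # stands in for demographic data
    events: list = field(default_factory=list)


@dataclass
class Sample:
    sample_id: str
    sample_type: str
    preparation: str
    taken_on: date
    location: str
    donor: Donor


def events_after_sampling(sample):
    """Donor events recorded strictly after the sample was taken."""
    return [e for e in sample.donor.events if e.recorded_on > sample.taken_on]


# Invented example data, for illustration only.
donor = Donor("d1", 1950, [
    ClinicalEvent(date(1995, 3, 1), "gastritis"),
    ClinicalEvent(date(2009, 6, 15), "cancer_of_stomach"),
])
sample = Sample("s1", "gastric_mucosa", "paraffin-embedded",
                date(2001, 5, 20), "freezer 4, rack 2", donor)

later = events_after_sampling(sample)
print([e.code for e in later])
```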
Centralized broker for biobanking information
[Diagram: several Biobank and EHR sources linked through a central broker]
Language for semantic annotations of biobank data
- Formal ontologies
  - Precise, logical descriptions of annotations and queries
  - High expressiveness through compositionality
  - OWL-DL: Semantic Web standard for description logics; allows formulating axioms about what is universally true of all instances of a kind
- Specific components
  - Ground axioms provided by an upper-level ontology (BioTop)
  - Set of disjoint upper-level categories and relations, together with related constraints
  - Ontological description of the domain: SNOMED CT, OBO Foundry
Description logics representation and retrieval
- Example query: retrieve all gastric mucosa samples from before 2003 of patients who had cancer of stomach after 2008
- Representation language: OWL DL
- Editor: Protégé 4.2
- Reasoner: HermiT
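The example query can be imitated in plain Python to show what the retrieval has to check. This is not the OWL DL representation used by the authors and involves no reasoner: the toy `IS_A` hierarchy, the codes, and the sample data are all invented, and `is_subsumed_by` only walks a single-parent chain, which is far weaker than description logic classification. It does show why subsumption matters: a donor annotated with a more specific stomach cancer code should still match.

```python
# Toy single-parent is-a hierarchy (hypothetical codes, not SNOMED CT).
IS_A = {
    "gastric_mucosa": "stomach_tissue",
    "stomach_tissue": "tissue",
    "adenocarcinoma_of_stomach": "cancer_of_stomach",
    "cancer_of_stomach": "cancer",
}


def is_subsumed_by(code, ancestor):
    """True if code equals ancestor or reaches it by following is-a links."""
    while code is not None:
        if code == ancestor:
            return True
        code = IS_A.get(code)
    return False


# Invented example data: sampling year per sample, diagnoses per donor.
samples = [
    {"id": "s1", "type": "gastric_mucosa", "year": 2001, "donor": "d1"},
    {"id": "s2", "type": "gastric_mucosa", "year": 2005, "donor": "d1"},
    {"id": "s3", "type": "gastric_mucosa", "year": 2000, "donor": "d2"},
    {"id": "s4", "type": "blood", "year": 1999, "donor": "d1"},
]
diagnoses = {
    "d1": [("adenocarcinoma_of_stomach", 2009)],
    "d2": [("cancer_of_stomach", 2006)],
}


def query(samples, diagnoses):
    """Gastric mucosa samples from before 2003 whose donor had
    cancer of stomach (or any subtype) after 2008."""
    hits = []
    for s in samples:
        if not (is_subsumed_by(s["type"], "gastric_mucosa") and s["year"] < 2003):
            continue
        donor_dx = diagnoses.get(s["donor"], [])
        if any(is_subsumed_by(code, "cancer_of_stomach") and year > 2008
               for code, year in donor_dx):
            hits.append(s["id"])
    return hits


print(query(samples, diagnoses))
```

Only `s1` satisfies both conditions here: `s2` was taken too late, the diagnosis for `s3`'s donor falls before 2008, and `s4` is not gastric mucosa.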
Requirements
- Formal representations
  - Ontological representation of information models and terminologies
  - Ontological representation of data about specimens
- Joint, universally used clinical terminology
- Expressive and stable upper-level ontologies (+ ontological relations)
- Scope and granularity of the EHR extract of interest for biobank-related queries
- Specification of the structure and function of the central repository
- Steps for information translation from legacy systems
  - Mappings
  - Interfaces
  - Update policies
Challenges
- Prototypical status of DL reasoners and editor
- Performance problems with expressive ontologies
- Modularization of large clinical terminologies in response to data and query under scrutiny
- Organization of the central repository
- Local mappings / translations
- Logistics (samples)
- Privacy and IP issues
- Business model
Thanks
- Andrade et al.: Requirements for Semantic Biobanks
- CAPES (Brazil): Programa de Doutorado no País com Estágio no Exterior
- FP7 NoE SemanticHealthNet