Big Data Platforms: Meeting Report and Insights
This report from the EGI-InSPIRE Big Data Platforms meeting summarizes presentations on topics including the DBSCAN algorithm, Hecuba integration with COMPSs, cloud infrastructure development, and the instantiation of Hadoop clusters. The outcomes highlight interest in further discussion, opportunities for collaboration, the challenges of running Hadoop in the FedCloud, and initiatives to expand existing projects to additional sites.
Presentation Transcript
EGI-InSPIRE Big Data Platforms (Meeting report from 15/Jan/2015)
Slides and meeting notes: https://indico.egi.eu/indico/conferenceDisplay.py?confId=2425
Gergely Sipos, gergely.sipos@egi.eu
www.egi.eu, EGI-InSPIRE RI-261323
Presentations

Björn Hagemeier - Big Data at JSC (DE)
  - NASA use case
  - DBSCAN algorithm
  - UNICORE integration with the Hadoop Distributed File System

Daniele Lezzi - Hecuba and integration with COMPSs (ES)
  - HECUBA, Dataclay: tools to integrate with non-relational DBs
  - COMPSs integrated with Cassandra (persistent storage)

Esteban Freire - Current cloud infrastructure at CESGA (ES)
  - Developing a web portal through which users can instantiate Hadoop clusters on OpenNebula sites

Giacinto Donvito - Big data @ INFN Bari (IT)
  - Two national projects on OS IaaS for public administration and research
  - Hadoop with Sahara: data-intensive cluster on top of OpenStack
  - NoSQL DB on a dynamic cluster instantiated with Heat
  - Test Hadoop across multiple sites by using its core features, such as hierarchical storage

Ignacio Blanquer - UPV-IBM's Big data observatory & Hadoop infrastructure management (ES)
  - IBM-UPV 'Big Data Observatory' with IBM InfoSphere BigInsights
  - GRYCAP: IaaS management framework - a set of services for managing VMs and data on IaaS clouds
  - Automatic configuration and recontextualization service: www.grycap.upv.es/im
  - Creation of elastic virtual clusters on top of both public and on-premise IaaS providers: www.grycap.upv.es/ec3

Viet Tran - Big data activities at IISAS (SK)
  - Developed several products related to Hadoop big data applications (RDB2Onto, ACoMA, EMBET, RIDAR, WEBCRAWLER)
  - Involved in 5 current projects in this area (4 national)
  - Hadoop development and production clusters

+ Jens Jensen (STFC, UK), Kostas Koumantaros (GRNET, GR), Mario David (LIP, PT), Ruben Valles (BIFI, ES)
Outcomes

1. Need and interest in discussing the topic further
  - Case studies, tools, projects, experiences
  - Establish a new scenario in the FedCloud Task Force?

2. Opportunities to expand/adapt local initiatives to EGI:
  - UNICORE-HDFS service on EGI sites
  - COMPSs with No-SQL databases
  - Hadoop gateways (on an EGI Hadoop VO?)
  - CESGA portal
  - INFN Bari OpenStack-Sahara-Heat experiences
  - Expand the IBM-UPV 'Big Data Observatory' to additional sites
  - GRYCAP services - alternative interface to access the FedCloud
  - Sharing IISAS Hadoop products via AppDB

3. Challenges with Hadoop
  - Hadoop's heavy use of disk and network I/O is the biggest challenge to hosting Hadoop in the FedCloud
  - FedCloud would suit small and medium-size Hadoop jobs where the data set is already pre-deployed in HDFS
Thank you!