Introduction to Big Data Analysis - National Taipei University Course Overview
This course at National Taipei University delves into fundamental concepts, research issues, and practical applications of Big Data Analysis. Taught by Dr. Min-Yuh Day, the syllabus covers topics such as AI, machine learning, deep learning, and industry practices related to big data analysis. Students will gain hands-on experience in Python and SAS Viya while exploring the role of data science in analyzing large datasets. The course emphasizes understanding core competencies in information technology and system development. Join this elective course to enhance your skills in data mining, artificial intelligence, and financial technology within the context of big data analytics.
Presentation Transcript
Big Data Analysis Introduction to Big Data Analysis 1122BDA01 MBA, IM, NTPU (M6031) (Spring 2024) Tue 2, 3, 4 (9:10-12:00) (B3F17) Min-Yuh Day, Ph.D., Associate Professor, Institute of Information Management, National Taipei University https://web.ntpu.edu.tw/~myday https://meet.google.com/paj-zhhj-mya 1 2024-02-20
Min-Yuh Day, Ph.D.: Associate Professor, Information Management, NTPU; Visiting Scholar, IIS, Academia Sinica; Ph.D., Information Management, NTU; Director, Intelligent Financial Innovation Technology (IFIT) Lab, IM, NTPU; Associate Director, Fintech and Green Finance Center, NTPU. Artificial Intelligence, Financial Technology, Big Data Analytics, Data Mining and Text Mining, Electronic Commerce. 2
Course Syllabus: National Taipei University, Academic Year 112, 2nd Semester (Spring 2024). Course Title: Big Data Analysis. Instructor: Min-Yuh Day. Course Class: MBA, IM, NTPU (3 Credits, Elective). Details: In-Class and Distance Learning EMI Course (3 Credits, Elective, One Semester) (M6031). Time & Place: Tue 2, 3, 4 (9:10-12:00) (B3F17). Google Meet: https://meet.google.com/paj-zhhj-mya 3
Course Objectives 1. Understand the fundamental concepts and research issues of Big Data Analysis. 2. Gain hands-on practice in Big Data Analysis. 3. Conduct information systems research in the context of Big Data Analysis. 4
Course Outline This course introduces the fundamental concepts, research issues, and hands-on practices of Big Data Analysis. Topics include: 1. Introduction to Big Data Analysis 2. AI, Data Science and Big Data Analysis 3. Foundations of Big Data Analysis in Python 4. Machine Learning: SAS Viya, Data Preparation and Algorithm Selection 5. Machine Learning: Decision Trees and Ensembles of Trees 6. Machine Learning: Neural Networks (NN) and Support Vector Machines (SVM) 7. Machine Learning: Model Assessment and Deployment 8. Generative AI and Large Language Models (LLM) for Big Data Analysis 9. Deep Learning for ESG and Finance Big Data Analysis 10. Industry Practices of Big Data Analysis 11. Case Study on Big Data Analysis 5
Core Competence: Exploring new knowledge in information technology, system development and application (80%); Internet marketing planning ability (10%); Thesis writing and independent research skills (10%). 6
Four Fundamental Qualities: Professionalism: Creative thinking and Problem-solving (40%), Comprehensive Integration (40%); Interpersonal Relationship: Communication and Coordination (10%), Teamwork (5%); Ethics: Honesty and Integrity (0%), Self-Esteem and Self-reflection (0%); International Vision: Caring for Diversity (0%), Interdisciplinary Vision (5%). 7
College Learning Goals Ethics/Corporate Social Responsibility Global Knowledge/Awareness Communication Analytical and Critical Thinking 8
Department Learning Goals Information Technologies and System Development Capabilities Internet Marketing Management Capabilities Research capabilities 9
Syllabus (Week, Date: Subject/Topics)
Week 1, 2024/02/20: Introduction to Big Data Analysis
Week 2, 2024/02/27: AI, Data Science and Big Data Analysis
Week 3, 2024/03/05: Foundations of Big Data Analysis in Python
Week 4, 2024/03/12: Case Study on Big Data Analysis I
Week 5, 2024/03/19: Machine Learning: SAS Viya, Data Preparation and Algorithm Selection
Week 6, 2024/03/26: Machine Learning: Decision Trees and Ensembles of Trees
Week 7, 2024/04/02: Self-study
Week 8, 2024/04/09: Midterm Project Report
Syllabus (Week, Date: Subject/Topics)
Week 9, 2024/04/16: Machine Learning: Neural Networks (NN) and Support Vector Machines (SVM)
Week 10, 2024/04/23: Machine Learning: Model Assessment and Deployment
Week 11, 2024/04/30: Case Study on Big Data Analysis II
Week 12, 2024/05/07: Generative AI and Large Language Models (LLM) for Big Data Analysis
Week 13, 2024/05/14: Industry Practices of Big Data Analysis
Week 14, 2024/05/21: Deep Learning for ESG and Finance Big Data Analysis
Week 15, 2024/05/28: Final Project Report I
Week 16, 2024/06/04: Final Project Report II
Teaching Methods and Activities Lecture Discussion Practicum 12
Evaluation Methods: Individual Presentation (60%); Group Presentation (10%); Case Report (10%); Class Participation (10%); Assignment (10%). 13
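As a rough illustration of how the evaluation weights above combine into one course score, here is a minimal Python sketch. The weights come from the slide; the component scores, the 0-100 scale, and the names `WEIGHTS`, `final_score`, and `example_scores` are assumptions made for the example, not the course's official grading procedure.

```python
# Weights (in percent) taken from the Evaluation Methods slide.
WEIGHTS = {
    "Individual Presentation": 60,
    "Group Presentation": 10,
    "Case Report": 10,
    "Class Participation": 10,
    "Assignment": 10,
}


def final_score(scores, weights=WEIGHTS):
    """Return the weighted average of component scores (assumed 0-100 scale)."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100 percent")
    return sum(scores[name] * weight for name, weight in weights.items()) / 100


# Hypothetical component scores, invented purely for illustration.
example_scores = {
    "Individual Presentation": 85,
    "Group Presentation": 90,
    "Case Report": 80,
    "Class Participation": 100,
    "Assignment": 70,
}

print(final_score(example_scores))
```

Because the individual presentation carries 60% of the weight, it dominates the result: in this made-up example the weighted score equals the individual presentation score even though the other components vary widely.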
Required Texts Aurélien Géron (2022), Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, 3rd Edition, O'Reilly Media. 14 Source: https://www.amazon.com/Hands-Machine-Learning-Scikit-Learn-TensorFlow/dp/1098125975
Reference Books Yves Hilpisch (2018), Python for Finance: Mastering Data-Driven Finance, 2nd Edition, O'Reilly Media. Yuxing Yan (2017), Python for Finance: Apply powerful finance models and quantitative analysis with Python, Second Edition, Packt Publishing 15
Other References SAS (2023), Machine Learning Using SAS Viya SAS (2023), 2023 SAS Machine Learning Academic Certification Program 16
Aurélien Géron (2022), Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, 3rd Edition, O'Reilly Media. 17 Source: https://www.amazon.com/Hands-Machine-Learning-Scikit-Learn-TensorFlow/dp/1098125975
Yves Hilpisch (2018), Python for Finance: Mastering Data-Driven Finance, O'Reilly 18 Source: https://www.amazon.com/Python-Finance-Mastering-Data-Driven/dp/1492024333
Yuxing Yan (2017), Python for Finance: Apply powerful finance models and quantitative analysis with Python, Second Edition, Packt Publishing 19 Source: https://www.amazon.com/Python-Finance-powerful-quantitative-analysis/dp/1787125696
Social Network Based Big Data Analysis and Applications, Lecture Notes in Social Networks, Mehmet Kaya, Jalal Kawash, Suheil Khoury, Min-Yuh Day, Springer International Publishing, 2018. 20 Source: https://www.amazon.com/Network-Analysis-Applications-Lecture-Networks/dp/3319781952
2023 SAS Machine Learning Academic Certification Program SAS Viya 21
SAS: Leader in Analytics and AI. About SAS: SAS was founded in 1976. SAS has customers in nearly 150 countries. 96 of the Top 100 of the 2017 Fortune 500 list are SAS customers or their affiliates. 12,170 total employees. SAS is investing $1 billion in artificial intelligence (AI) through software innovation, education, expert services and more. For the 16th consecutive year, Gartner has positioned SAS as a Leader in the Magic Quadrant for Data Quality Solutions. SAS ranks number one for market share, according to the IDC report. 22
2024 SAS Machine Learning Academic Certification Program Why SAS Certification? "SAS is among the top 10 most important big data and analytics certifications in 2021." (CIO Magazine) More than 219,000 jobs
2024 SAS Machine Learning Academic Certification Program SAS AI Certification Learning Path (diagram labels: Main certification in this program; Machine Learning Specialist; Forecasting and Optimization; Natural Language Processing and Computer Vision; Certified ModelOps Specialist; AI & Machine Learning Professional). Candidates will be awarded the SAS Certified Professional: AI and Machine Learning once they have earned the SAS Certified Specialist: Machine Learning credential, the SAS Certified Specialist: Forecasting and Optimization and the SAS Certified Specialist: Natural Language Processing and Computer Vision.
2024 SAS Machine Learning Academic Certification Program Teaching and Learning Resource For Teacher For Student Access to SAS Viya Dataset Machine Learning Online Course (10h) Exam Preparation Online Training (6h) Access to SAS Viya Dataset Tutorial Instructor Materials https://www.sas.com/en_us/software/viya-for-learners.html
2024 SAS Machine Learning Academic Certification Program Teaching and Learning Resource 1. Online Courses 2. Exam Preparation Training Lesson 1: Introduction to SAS Visual Data Mining and Machine Learning Lesson 2: Machine Learning Algorithms Lesson 3: Ensemble Machine Learning Algorithms Lesson 4: Model Assessment and Implementation Lesson 5: Factorization Machines
2024 SAS Machine Learning Academic Certification Program Content 10-hour Machine Learning Online Courses 6-hour Exam Preparation Online Training Access to SAS Viya Practice Exam Academic Discount NT$ 2,700 Certification Exam
Big Data Analysis 28
AI, Big Data, Cloud Computing: Evolution of Decision Support, Business Intelligence, and Analytics (figure labels: AI, Cloud Computing, Big Data, DM, BI) Source: Ramesh Sharda, Dursun Delen, and Efraim Turban (2017), Business Intelligence, Analytics, and Data Science: A Managerial Perspective, 4th Edition, Pearson 29
Big Data 4 V 30 Source: https://www-01.ibm.com/software/data/bigdata/
AI, NLP, ML, DL 32 Source: Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta (2020), Practical Natural Language Processing: A Comprehensive Guide to Building Real-World NLP Systems, O'Reilly Media.
Generative AI 33 Source: Jeong, Cheonsu. "A Study on the Implementation of Generative AI Services Using an Enterprise Data-Based LLM Application Architecture." arXiv preprint arXiv:2309.01105 (2023).
Text Analytics and Text Mining 34 Source: Ramesh Sharda, Dursun Delen, and Efraim Turban (2017), Business Intelligence, Analytics, and Data Science: A Managerial Perspective, 4th Edition, Pearson
AI, ML, DL: Artificial Intelligence (AI) contains Machine Learning (ML), which contains Deep Learning (DL). Machine Learning: Supervised Learning, Unsupervised Learning, Semi-supervised Learning, Reinforcement Learning. Deep Learning: CNN, RNN, LSTM, GRU, GAN. 35 Source: https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/deep_learning.html
Stephan Kudyba (2014), Big Data, Mining, and Analytics: Components of Strategic Decision Making, Auerbach Publications Source: http://www.amazon.com/gp/product/1466568704 36
Architecture of Big Data Analytics
Big Data Sources: internal and external; multiple formats, multiple locations, multiple applications
Big Data Transformation: raw data; middleware; extract, transform, load; data warehouse; traditional format (CSV, tables); transformed data
Big Data Platforms & Tools: Hadoop, MapReduce, Pig, Hive, Jaql, Zookeeper, HBase, Cassandra, Oozie, Avro, Mahout, others
Big Data Analytics Applications: queries, reports, OLAP, data mining
Source: Stephan Kudyba (2014), Big Data, Mining, and Analytics: Components of Strategic Decision Making, Auerbach Publications 37
Architecture of Big Data Analytics (diagram repeated from the previous slide: Big Data Sources, Big Data Transformation, Big Data Platforms & Tools, Big Data Analytics Applications) Source: Stephan Kudyba (2014), Big Data, Mining, and Analytics: Components of Strategic Decision Making, Auerbach Publications 38
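The flow named in the architecture above (raw data, extract-transform-load, a MapReduce-style platform, then a report) can be sketched on a single machine in plain Python. This is only a toy sketch of the idea, not Hadoop's actual API: the CSV contents, the column names, and every function name below are hypothetical and chosen for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw data in a "traditional format" (CSV); values are made up.
RAW_CSV = """region,amount
north,120
south,80
north,30
east,
south,50
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and cast amounts to int."""
    return [
        {"region": row["region"], "amount": int(row["amount"])}
        for row in rows
        if row["amount"]
    ]


def map_phase(rows):
    """Map: emit one (key, value) pair per record."""
    return [(row["region"], row["amount"]) for row in rows]


def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def run_pipeline(text):
    """Run extract -> transform -> map -> shuffle -> reduce on raw CSV text."""
    return reduce_phase(shuffle(map_phase(transform(extract(text)))))


totals = run_pipeline(RAW_CSV)
print(totals)
```

In a real cluster the map and reduce phases run in parallel across many machines and the shuffle moves data between them; here they are ordinary function calls so that the shape of the computation stays visible.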
Social Big Data Mining (Hiroshi Ishikawa, 2015) Source: http://www.amazon.com/Social-Data-Mining-Hiroshi-Ishikawa/dp/149871093X 39
Architecture for Social Big Data Mining (Hiroshi Ishikawa, 2015)
Analysts: model construction; explanation by model; construction and confirmation of individual hypothesis; description and execution of application-specific task
Enabling Technologies, Conceptual Layer: integrated analysis model; integrated analysis
Enabling Technologies, Logical Layer: natural language processing; information extraction; anomaly detection; discovery of relationships among heterogeneous data; large-scale visualization; data mining; application-specific task; multivariate analysis
Enabling Technologies, Physical Layer: parallel distributed processing; software; hardware
Social Data
Source: Hiroshi Ishikawa (2015), Social Big Data Mining, CRC Press 40
Business Intelligence (BI) Infrastructure 41 Source: Kenneth C. Laudon & Jane P. Laudon (2014), Management Information Systems: Managing the Digital Firm, Thirteenth Edition, Pearson.
Data Warehouse, Data Mining and Business Intelligence (layers listed from top to bottom; potential to support business decisions increases toward the top; typical role in parentheses): Decision Making (End User); Data Presentation: Visualization Techniques (Business Analyst); Data Mining: Information Discovery (Data Analyst); Data Exploration: Statistical Summary, Querying, and Reporting; Data Preprocessing/Integration, Data Warehouses (DBA); Data Sources: Paper, Files, Web documents, Scientific experiments, Database Systems. 42 Source: Jiawei Han and Micheline Kamber (2006), Data Mining: Concepts and Techniques, Second Edition, Elsevier
The Evolution of BI Capabilities Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 43
Three Types of Analytics Source: Ramesh Sharda, Dursun Delen, and Efraim Turban (2017), Business Intelligence, Analytics, and Data Science: A Managerial Perspective, 4th Edition, Pearson 44
Big Data, Data Mining, and Machine Learning: Value Creation for Business Leaders and Practitioners, Jared Dean, Wiley, 2014. 45 Source: https://www.amazon.com/Data-Mining-Machine-Learning-Practitioners/dp/1118618041
Data Mining at the Intersection of Many Disciplines Artificial Intelligence Machine Learning Pattern Recognition Statistics DATA MINING Mathematical Modeling Databases Management Science & Information Systems Source: Turban et al. (2011), Decision Support and Business Intelligence Systems 46
47 Source: http://www.amazon.com/Big-Data-Analytics-Turning-Money/dp/1118147596
48 Source: http://www.amazon.com/Big-Data-Revolution-Transform-Mayer-Schonberger/dp/B00D81X2YE
49 Source: https://www.thalesgroup.com/en/worldwide/big-data/big-data-big-analytics-visual-analytics-what-does-it-all-mean
Big Data with Hadoop Architecture 50 Source: https://software.intel.com/sites/default/files/article/402274/etl-big-data-with-hadoop.pdf