Essential Insights into Machine Learning and Data Science Practices
Delve into a comprehensive exploration of machine learning and data science covering topics such as AI pitfalls, effective data management strategies, algorithm selection, and the importance of transparency and action in AI applications. Gain valuable insights into the nuances of classification, regression, and clustering techniques. Learn to navigate the complexities of AI implementation and decision-making processes while avoiding common pitfalls and misconceptions.
Presentation Transcript
Machine Learning & Data Science: Sylvia Unwin, Faculty, Program Chair, Assistant Dean, iBIT
Machine Learning: Attended TDWI in Oct 2017. Focus on machine learning, data science, Python, AI. Started with a catchy opening speech, "BS-Free AI For Business", with a top 5 BS list.
AI: What's the BS? "AI is first": according to the speaker, it doesn't solve a necessary real-world problem. Startups (investments) in scaling AI. Doesn't show ROI without the promise of more, perfect and better data.
Avoid: A big data problem will only provide a small data solution. Thinking more data will solve the problem (it will only work if the data is perfect). Not defining what the problem is; be specific (reduce waste by 10x). Know who owns the data. Avoid scaling too quickly.
Avoid: No black boxes; AI requires trust, and trust requires transparency. No technical explanations (too many acronyms), no invented scores. Inaction: nothing will happen if no action is taken.
Why AI: Be aware of your focus. Understand the data (common theme). Scalability. Take action.
Machine Learning using Python: Machine learning: continuously improving models; cost reduction; classification of space data. Definitions of various models: Regression (pattern recognizer), Classification, Clustering.
Classification: Supervised: trained with data, fully labeled, user involved with training. Unsupervised: no training data, groupings of similar attributes (characteristics); the computer uses techniques such as clustering. Discrete vs. continuous values.
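The difference between the two modes can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the measurements, the labels "small" and "large", and the helper names `fit_centroids`, `predict` and `two_means` are hypothetical and do not come from the talk. The supervised part uses the labels to learn one mean per class; the unsupervised part groups the same values without ever seeing a label.

```python
# Toy 1-D measurements and labels (hypothetical values, for illustration only).
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["small", "small", "small", "large", "large", "large"]


def fit_centroids(xs, ys):
    """Supervised: use the known labels to compute one mean per class."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {y: sum(g) / len(g) for y, g in groups.items()}


def predict(centroids, x):
    """Assign x to the class whose mean is closest."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))


def two_means(xs, iterations=10):
    """Unsupervised: split values into two clusters without any labels.

    Assumes xs contains at least two distinct values.
    """
    c0, c1 = min(xs), max(xs)  # simple deterministic starting centres
    for _ in range(iterations):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return [0 if abs(x - c0) <= abs(x - c1) else 1 for x in xs]


centroids = fit_centroids(values, labels)
supervised_guess = predict(centroids, 4.5)  # uses what the labels taught it
cluster_ids = two_means(values)             # only sees the raw values
print(supervised_guess)
print(cluster_ids)
```

The cluster ids carry no meaning by themselves; a person still has to decide what each group represents, which is the practical cost of having no labels.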
Understand Which Algorithm to Use: Categorical (discrete) values: Classification (supervised) or Clustering (unsupervised). Continuous values: Regression (supervised).
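One way to read this slide is as a small lookup on two questions: is the target categorical or continuous, and is labeled training data available? The sketch below encodes that reading. The function name `choose_family` and its parameters are hypothetical, and the continuous/unsupervised combination returns `None` because the slide does not name a technique for it.

```python
def choose_family(target_is_categorical, has_labels):
    """Map the two questions from the slide to a family of algorithms."""
    if has_labels:
        # Supervised learning: labeled examples are available.
        return "classification" if target_is_categorical else "regression"
    if target_is_categorical:
        # Unsupervised learning that produces discrete groups.
        return "clustering"
    # Continuous and unsupervised: not covered on the slide.
    return None


print(choose_family(target_is_categorical=True, has_labels=True))
print(choose_family(target_is_categorical=False, has_labels=True))
print(choose_family(target_is_categorical=True, has_labels=False))
```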
Algorithms: Logistic Regression: simple, large scale, can be parallelized. Neural Networks: unstructured data, no limit to complexity, good on large datasets. Decision Trees: easy to interpret, fast prediction, rules based.
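To illustrate why decision trees are described as rules based and easy to interpret, here is a minimal sketch of a one-level tree (often called a decision stump) in plain Python. It is not a full decision tree algorithm: it searches for a single threshold on a single feature. The data (hours studied versus passing) and the name `fit_stump` are made up for the example.

```python
def fit_stump(xs, ys):
    """Find the threshold t with the fewest errors for the rule
    'predict 1 if x > t, else 0'. Returns (threshold, error_count)."""
    ordered = sorted(set(xs))
    # Candidate thresholds: midpoints between neighbouring distinct values.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best_t, best_errors = None, len(xs) + 1
    for t in candidates:
        errors = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t, best_errors


# Hypothetical data: hours studied and whether the student passed (1) or not (0).
hours = [1, 2, 3, 4, 5, 6]
passed = [0, 0, 0, 1, 1, 1]

threshold, errors = fit_stump(hours, passed)
print(f"rule: predict 1 if hours > {threshold} ({errors} training errors)")
```

The learned model is a single readable rule, and prediction is one comparison, which matches the "easy to interpret, fast prediction" points on the slide.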
Evaluate Model: Start with all available data. Split it into training and testing data. Train the model on the training data, run the testing data through the model, and measure performance on the output.
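The evaluation loop on this slide can be sketched with the standard library alone. Everything here is an assumption made for illustration: the toy data set, the 25% test fraction, the fixed seed, and the one-nearest-neighbour model standing in for whatever model is actually being evaluated.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, x):
    """Stand-in model: return the label of the closest training value."""
    nearest = min(train, key=lambda row: abs(row[0] - x))
    return nearest[1]


def accuracy(train, test):
    """Fraction of test rows whose label the model predicts correctly."""
    correct = sum(predict_1nn(train, x) == y for x, y in test)
    return correct / len(test)


# Hypothetical, well-separated data: (value, label) pairs.
data = [(float(v), 0) for v in range(10)] + [(float(v), 1) for v in range(100, 110)]

train, test = train_test_split(data)
score = accuracy(train, test)
print(len(train), len(test), score)
```

The important design point is that the test rows are never shown to the model during training, so the score estimates performance on unseen data.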
Examples: Predict price of houses. Book recommendation. Petal vs. sepal of iris. Walmart beer & diapers.
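The house price example is a regression problem. As a rough sketch, a one-variable least-squares fit can be written in plain Python. The sizes and prices below are invented so that the relationship is exactly linear, and the names `fit_line` and `predict_price` are hypothetical; real housing data would be noisy and would need more than one feature.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_price(slope, intercept, size):
    """Predict a price from a size using the fitted line."""
    return slope * size + intercept


# Invented data: size in square metres, price in thousands.
sizes = [50, 80, 100, 120]
prices = [110, 170, 210, 250]  # constructed as 2 * size + 10

slope, intercept = fit_line(sizes, prices)
estimate = predict_price(slope, intercept, 90)
print(slope, intercept, estimate)
```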
Other: Confusion matrix: for a binary problem, shows how wrong the model is. Train/test split. Cross validation: split the data into slices, assess the model on a different slice each time, and average the results. More data or more model? Build a learning curve.
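Two of these ideas can be sketched in plain Python under some assumptions. The first part counts the four cells of a binary confusion matrix for made-up labels and predictions. The second part runs k-fold cross validation with contiguous slices and a one-nearest-neighbour stand-in model. All function names and data are hypothetical, and one point (x = 4) is deliberately labeled against its neighbours so the average is not perfect.

```python
def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        elif a == 1 and p == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous slices of range(n)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def cross_validate(xs, ys, k):
    """Average 1-nearest-neighbour accuracy over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        correct = 0
        for i in test_idx:
            nearest = min(train_idx, key=lambda j: abs(xs[j] - xs[i]))
            correct += ys[nearest] == ys[i]
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Made-up labels and predictions for the confusion matrix.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
matrix = confusion_matrix(actual, predicted)

# Made-up 1-D data for cross validation.
xs = [1, 2, 3, 4, 11, 12, 13, 14]
ys = [0, 0, 0, 1, 1, 1, 1, 1]
mean_score = cross_validate(xs, ys, k=4)
print(matrix, mean_score)
```

Contiguous slices keep the sketch short; in practice the rows are usually shuffled first so that each slice is representative of the whole data set.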
Jupyter: Navigator. Jupyter Notebook. Examples in Python. Not enough time.
Data Visualization: Know your audience. Mechanism for feedback. How to direct the focus: charts, images. Develop a sense of storytelling. Know your data. Relationship to user. Be creative.
Data Science: May be a data artist. Problem & data = acceptable solution. Storytelling: make the analytics tell a more focused story. Don't undervalue hands-on experience. Target something useful. Analytics is AI.
Robotics & AI: Validated topics introduced: statistics, data analytic techniques, data visualization. Not all science, there is some art. Python programming. AI is first.