Choosing Your FBLA Competitive Event
Below is a guide to selecting a competitive event that aligns with your interests and skills. From speaking in front of an audience to working on business projects, explore various categories like Chapter Events, Individual Events, and Individual or Team Events. Discover opportunities in coding, programming, business ethics, 3D animation, and more. Decide whether you prefer to work individually or as part of a team, and explore events suited for different grades. Get ready to showcase your talents and knowledge in the FBLA arena!
Presentation Transcript
Data Science Lecture 1
Exponential Increase in Data
All human-generated information up to 2003 amounted to about 5 exabytes. In 2011 the same amount of data was generated every 2 days, and now it would be generated every 10 minutes.
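To put the slide's figures in perspective, here is a back-of-the-envelope calculation of the generation rates they imply. It is only a sketch: it assumes the decimal (SI) definition of an exabyte and takes the slide's numbers at face value.

```python
# Rates implied by the slide's figures (taken at face value).
EXABYTE = 10**18  # bytes; decimal (SI) definition is an assumption

volume = 5 * EXABYTE             # "about 5 exabytes"
two_days_s = 2 * 24 * 60 * 60    # seconds in 2 days
ten_min_s = 10 * 60              # seconds in 10 minutes

rate_2011 = volume / two_days_s  # bytes per second under the 2011 claim
rate_now = volume / ten_min_s    # bytes per second under the "now" claim
speedup = two_days_s / ten_min_s # ratio between the two rates

print(f"2011: {rate_2011 / 1e12:.1f} TB/s")  # 28.9 TB/s
print(f"now:  {rate_now / 1e15:.1f} PB/s")   # 8.3 PB/s
print(f"speedup: {speedup:.0f}x")            # 288x
```

Going from "every 2 days" to "every 10 minutes" corresponds to a 288-fold increase in rate.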
Data is the New Oil (World Economic Forum, 2011)
"Data is the new oil." Coined in 2006 by Clive Humby, a British data commercialization entrepreneur, this now-famous phrase was embraced by the World Economic Forum in a 2011 report.
"Data is just like crude oil. It's valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value."
What is Data Science?
Fortune magazine: "Hot New Gig in Tech"
Hal Varian, Google's Chief Economist, NYT, 2009 (Statistics: the next attractive job): "The ability to take data, to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it: that's going to be a hugely important skill."
Mike Driscoll, CEO of Metamarkets: "Data science, as it's practiced, is a blend of Red-Bull-fueled hacking and espresso-inspired statistics." "Data science is the civil engineering of data. Its acolytes possess a practical knowledge of tools & materials, coupled with a theoretical understanding of what's possible."
Data Science: A Visual Definition
Drew Conway's Data Science Venn Diagram
What do data scientists do?
"They need to find nuggets of truth in data and then explain it to the business leaders." -- Richard Snee, EMC
"Data scientists tend to be 'hard scientists', particularly physicists, rather than computer science majors. Physicists have a strong mathematical background, computing skills, and come from a discipline in which survival depends on getting the most from the data. They have to think about the big picture, the big problem." -- DJ Patil, Chief Scientist at LinkedIn
Mike Driscoll's three skills of data geeks
1) Statistics: traditional analysis
2) Data Munging: parsing, scraping, and formatting data
3) Visualization: graphs, tools, etc.
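As a rough illustration of how the three skills fit together, the sketch below munges a few messy records, computes summary statistics, and draws a crude text chart. The sample records and the `munge` helper are invented for illustration and use only the Python standard library.

```python
import statistics

# Hypothetical raw records, messy on purpose (invented for illustration).
raw = [
    " alice , 12.5 ",
    "bob,7",
    "carol, n/a",   # unparseable value, dropped during munging
    "dave ,  9.25",
]

def munge(lines):
    """Data munging: parse and clean 'name, value' records."""
    records = {}
    for line in lines:
        name, _, value = line.partition(",")
        try:
            records[name.strip()] = float(value)
        except ValueError:
            continue  # skip rows whose value is not numeric
    return records

data = munge(raw)
values = list(data.values())

# Statistics: traditional summary analysis.
mean = statistics.mean(values)
stdev = statistics.stdev(values)

# Visualization: a crude text bar chart, one '#' per (rounded) unit.
chart = [f"{name:<6}{'#' * round(value)}" for name, value in data.items()]

print(f"mean={mean:.2f} stdev={stdev:.2f}")
print("\n".join(chart))
```

In practice the munging step usually takes the most code, which is consistent with the emphasis the slides place on data preparation.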
Data Science
"Data Science refers to an emerging area of work concerned with the collection, preparation, analysis, visualization, management and preservation of large collections of information." (An Introduction to Data Science, by Jeffrey Stanton, Syracuse University, School of Information Studies)
Data Science: A Definition
Data Science is the science which uses computer science, statistics and machine learning, visualization and human-computer interaction to collect, clean, integrate, analyze, visualize, and interact with data in order to create data products.
- Service Change: converting new data insights into (often small) changes to business processes
- Data Science: applying advanced statistical tools to existing data to generate new insights
- Smarter Work: more efficient and effective use of staff and resources
Data Scientist
"A data scientist is someone who can obtain, scrub, explore, model and interpret data, blending hacking, statistics and machine learning. Data scientists not only are adept at working with data, but appreciate data itself as a first-class product." -- Hilary Mason, chief scientist at bit.ly
Data wrangling, data jujitsu, data munging
Three types of tasks
1) Preparing to run a model: gathering, cleaning, integrating, restructuring, transforming, loading, filtering, deleting, combining, merging, verifying, extracting, shaping, massaging
2) Running the model
3) Communicating the results
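The three tasks can be sketched end to end in a few lines. This is a minimal illustration, not a recommended workflow: the (ad spend, sales) rows are invented, and ordinary least squares stands in for "the model".

```python
# 1) Preparing to run a model: clean hypothetical (ad_spend, sales) rows.
rows = [("10", "25"), ("20", "45"), ("", "30"), ("30", "65"), ("40", "85")]
clean = [(float(x), float(y)) for x, y in rows if x and y]  # drop incomplete rows

# 2) Running the model: ordinary least squares for y = intercept + slope * x.
n = len(clean)
mean_x = sum(x for x, _ in clean) / n
mean_y = sum(y for _, y in clean) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in clean)
s_xx = sum((x - mean_x) ** 2 for x, _ in clean)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# 3) Communicating the results: one plain-language sentence.
summary = (
    f"Based on {n} complete records, each extra unit of spend is "
    f"associated with about {slope:.1f} more units of sales."
)
print(summary)
```

Even in this toy case, step 1 decides which records the model ever sees, which is why the slide lists so many verbs under preparation.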
Data Science is about Data Products
- Data-driven apps (Mike Loukides): spellchecker, machine translator
- Interactive visualizations: Google flu application, Global Burden of Disease
- Online databases: enterprise data warehouse, Sloan Digital Sky Survey
Data science is about building data products, not just answering questions.
Data products empower others to use the data:
- May help communicate your results (e.g., Nate Silver's maps)
- May empower others to do their own analysis (e.g., Global Burden of Disease)
Goal of Data Science: turn data into data products.
Types of data science work
Data science tends to fall into three broad categories, ordered from simple to complex:
1) Investigating: aggregating and inspecting data to get basic insights on what is currently happening
2) Predicting: taking the data and using it to understand what will happen in the future
3) Optimizing: using the data to choose what the best choice of actions will be
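The three categories can be contrasted on a single toy dataset. The sketch below assumes an invented daily-demand history and made-up holding and shortage costs; the moving-average forecast and brute-force search are deliberately naive stand-ins for real forecasting and optimization methods.

```python
# Hypothetical daily demand history (units sold); invented for illustration.
demand = [8, 10, 9, 12, 11, 10, 12]

# Investigating: describe what is currently happening.
average = sum(demand) / len(demand)

# Predicting: a naive forecast, the mean of the last three days.
forecast = sum(demand[-3:]) / 3

# Optimizing: choose the stock level with the lowest total cost over history.
UNIT_HOLDING_COST = 1.0   # assumed cost per unsold unit
UNIT_SHORTAGE_COST = 3.0  # assumed cost per unit of unmet demand

def total_cost(stock):
    """Cost this stock level would have incurred across the demand history."""
    cost = 0.0
    for d in demand:
        cost += UNIT_HOLDING_COST * max(stock - d, 0)
        cost += UNIT_SHORTAGE_COST * max(d - stock, 0)
    return cost

best_stock = min(range(0, 21), key=total_cost)

print(f"average={average:.2f} forecast={forecast:.1f} best_stock={best_stock}")
```

Because a shortage is assumed to cost three times as much as a surplus, the chosen stock level sits above both the average and the forecast.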
Distinguishing Data Science from...
- Business Intelligence and Data Warehouse
- Statistics
- Data(base) Management
- Visualization
- Machine Learning
- Data Mining
What is data science?
- Deals with both structured and unstructured data.
- Associated with the cleansing, preparation and final analysis of data.
- Combines programming, logical reasoning, mathematics and statistics.
- Captures data in the most ingenious ways and encourages looking at things from a different perspective.
- Cleanses, prepares and aligns the data.
- An umbrella of several techniques that are used for extracting information and insights from data.
- Data scientists are responsible for creating data products and other data-based applications that deal with data in ways that conventional systems cannot.
What is data mining?
- It is the process of gathering information from huge databases that was previously incomprehensible and unknown, and then using that information to make relevant business decisions.
- A set of methods used in the process of knowledge discovery for identifying relationships and patterns that were previously unknown.
- Data mining is a merging of various other fields such as artificial intelligence, pattern recognition, data visualization, machine learning, statistics and so on.
- Primary goal: to extract information from various sets of data and transform it into proper, understandable structures for eventual use.
- A process used by data scientists and machine learning enthusiasts to convert large sets of data into something more usable.
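One classic instance of discovering previously unknown patterns is finding items that are frequently bought together. The sketch below counts co-occurring item pairs in a handful of invented shopping baskets; the baskets and the `MIN_SUPPORT` threshold are assumptions, and real data mining tools use far more scalable algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (invented for illustration).
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

# Count how often each unordered pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

MIN_SUPPORT = 3  # assumed threshold: a pair must occur in at least 3 baskets
frequent = {pair: n for pair, n in pair_counts.items() if n >= MIN_SUPPORT}

for pair, n in sorted(frequent.items()):
    print(pair, n)
```

The surviving pairs are the "patterns": nothing in the raw baskets states them explicitly, yet they can directly inform a business decision such as shelf placement.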
What is machine learning?
- Machine learning gives computers the ability to learn from new data sets without being explicitly programmed.
- Machine learning and data mining follow broadly the same process. But! Machine learning is a method of data analysis that automates analytical model building.
- It uses algorithms that iteratively learn from data, and in the process it lets computers find hidden insights without being explicitly told where to look.
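The phrase "iteratively learn from data" can be made concrete with gradient descent on a one-parameter model. This is a minimal sketch under stated assumptions: the training data are invented to follow y = 3x exactly, and the learning rate and iteration count are arbitrary choices.

```python
# Hypothetical training data following y = 3x (invented for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w = 0.0               # initial guess for the single model parameter
LEARNING_RATE = 0.01  # assumed step size

for _ in range(500):
    # Gradient of the mean squared error of the model y = w * x w.r.t. w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= LEARNING_RATE * grad  # move w a small step against the gradient

print(f"learned w = {w:.4f}")
```

Nowhere does the code state that the answer is 3; the value emerges from repeatedly adjusting `w` to reduce the error on the data.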
What is the difference between these three terms?
- Data scientists are responsible for coming up with data-centric products and applications that handle data in ways that conventional systems cannot.
- The process of data science is much more focused on the technical ability to handle any type of data. Unlike data mining and machine learning, it is also responsible for assessing the impact of data on a specific product or organization.
- Data science focuses on the science of data, while data mining deals with the process of discovering new patterns in big data sets.
- Data mining may appear similar to machine learning because both make use of algorithms. However, in machine learning, algorithms are used to gain knowledge from data sets, whereas in data mining, algorithms are only one part of a larger process. Unlike machine learning, data mining does not focus entirely on algorithms.
Data Science
- Data Science is a field of study which includes everything from Big Data Analytics and Data Mining to Visualization, Mathematics, and Statistics.
- Data Science has been referred to as the fourth paradigm of science (the other three being Theoretical, Empirical and Computational).
- Academia often conducts exclusive research in Data Science.
Predictive Modeling, Data
Key Differences Between Data Science and Data Mining
- Data Mining is an activity that is part of the broader Knowledge Discovery in Databases (KDD) process, while Data Science is a field of study, just like Applied Mathematics or Computer Science.
- Data Science is thought to be broader in scope, while Data Mining is considered narrower.
- Some Data Mining activities, such as statistical analysis, writing data flows and pattern recognition, can intersect with Data Science. Hence, Data Mining becomes a subset of Data Science.
- Machine Learning is used mostly for pattern recognition in Data Mining, while in Data Science it has a more general use.
Databases and Data Science
Aspect | Databases | Data Science
Data Value | Precious | Cheap
Data Volume | Modest | Massive
Examples | Bank records, personnel records, census, medical records | Online clicks, GPS logs, tweets, building sensor readings
Priorities | Consistency, error recovery, auditability | Speed, availability, query richness
Structured | Strongly (schema) | Weakly or none (text)
Properties | Transactions, ACID* | CAP* theorem (2/3), eventual consistency
Realizations | SQL | NoSQL: MongoDB, CouchDB, HBase, Cassandra, Riak, Memcached, Apache River
* ACID = Atomicity, Consistency, Isolation and Durability
* CAP = Consistency, Availability, Partition Tolerance
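The "Transactions, ACID" entry in the Databases column can be illustrated with Python's built-in `sqlite3` module. The sketch below shows atomicity and consistency only: the `accounts` table, the names and the `transfer` helper are invented for illustration, and the connection is used as a context manager so that a failed transfer is rolled back as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint encodes a consistency rule: no negative balances.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money atomically: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the failed transfer, the credit to bob is applied first and then undone when the debit violates the constraint. Systems in the Data Science column often relax exactly this guarantee in exchange for speed and availability.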
Business Intelligence and Data Science
Business Intelligence | Data Science
Querying the past | Querying the past, present and future
Machine Learning and Data Science
Machine Learning | Data Science
Develop new (individual) models | Explore many models, build and tune hybrids
Prove mathematical properties of models | Understand empirical properties of models
Improve/validate on a few, relatively clean, small datasets | Develop/use tools that can handle massive datasets
Publish a paper | Take action!
Data Scientists
"I worry that the Data Scientist role is like the mythical 'webmaster' of the 90s: master of all trades." -- Aaron Kimball, CTO Wibidata
What data science tells me:
- If you're a DBA, you need to learn to deal with unstructured data.
- If you're a statistician, you need to learn to deal with data that does not fit in memory.
- If you're a software engineer, you need to learn statistical modeling and how to communicate results.
- If you're a business analyst, you need to learn about algorithms and tradeoffs at scale.
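For the statistician's case, one common way to handle data that does not fit in memory is to process it as a stream and keep only running summaries. The sketch below uses Welford's online algorithm for the mean and sample variance; the `readings` generator is a hypothetical stand-in for a large file or sensor feed.

```python
def running_stats(stream):
    """Welford's online algorithm: mean and sample variance in one pass."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # accumulates the sum of squared deviations
    variance = m2 / (count - 1) if count > 1 else 0.0
    return count, mean, variance

def readings():
    """Hypothetical stand-in for a feed too large to load at once."""
    for i in range(1, 1_000_001):
        yield float(i % 10)

count, mean, variance = running_stats(readings())
print(f"n={count} mean={mean:.3f} variance={variance:.3f}")
```

Memory use stays constant no matter how many values pass through, because only three numbers are kept between iterations.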