Understanding Python Collections: Lists, Tuples, and Dictionaries
Data structures in Python such as lists, tuples, and dictionaries play a crucial role in storing and organizing data. Lists allow storing a collection of diverse data items, tuples provide immutability, and dictionaries facilitate key-value pair storage. Learn how to declare, initialize, access, and utilize these collection types efficiently in Python.
Python Collections ISYS 350
Collections: Data Structures to Store Many Data Items
List: a collection that is ordered (items are stored in the order they are added, and are indexed) and changeable (items can be modified). Allows duplicate members.
Tuple: a collection that is ordered and unchangeable (items cannot be modified after creation). Allows duplicate members. Typically used for storing constants.
Dictionary: a collection of key:value pairs that is changeable and indexed by key. Keys must be unique; values may repeat. Since Python 3.7 a dictionary keeps its keys in insertion order; earlier versions treated it as unordered.
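The three collection types can be compared side by side. This is a minimal sketch; the variable names (scores, weekend, car) are illustrative and do not come from the slides.

```python
# A list is ordered and changeable.
scores = [90, 85, 90]          # duplicates are allowed
scores[0] = 95                 # an item can be replaced in place

# A tuple is ordered but cannot be changed after it is created.
weekend = ("Saturday", "Sunday")

# A dictionary maps unique keys to values.
car = {"brand": "Ford", "year": 1964}
car["year"] = 1965             # assigning to an existing key replaces its value

print(scores)      # [95, 85, 90]
print(weekend[1])  # Sunday
print(car)         # {'brand': 'Ford', 'year': 1965}
```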
List
A list lets you store a group of data items. The items do not have to be the same data type; for example, a list can hold both strings and numbers. Items are easily accessed by index.
Note: Python has no built-in array type of the kind found in many other languages. The standard library provides an array module, and libraries such as NumPy and Pandas provide array-based data structures.
Declaring a list
In Python, lists are written with square brackets, with items separated by commas.
    mylist = [item1, item2, ...]
    fruitList = ["apple", "orange", "banana"]
Data items may have different data types:
    employeeList = ["peter", 30, "paul", 35, "mary", 28]
Print the list:
    print(employeeList)
Initializing a list with the repetition operator, *
    scores = [0] * 5    # five test scores: [0, 0, 0, 0, 0]
Members have both a positive index and a negative index, and can be accessed by either
    temps = [48.0, 30.5, 20.2, 100.0, 42.0]
Positive and negative index values:
    temps[0]    temps[-5]    # returns 48.0
    temps[1]    temps[-4]    # returns 30.5
    temps[2]    temps[-3]    # returns 20.2
    temps[3]    temps[-2]    # returns 100.0
    temps[4]    temps[-1]    # returns 42.0
List len() function
len() returns the number of elements in a list.
    fruitList = ["apple", "orange", "banana"]
    print(len(fruitList))    # shows 3
Note: Given any non-empty list, the last member has index len(list) - 1.
    >>> fruitList = ["apple", "orange", "banana"]
    >>> lastMember = fruitList[len(fruitList) - 1]
    >>> print(lastMember)
    banana
Accessing list elements with a for loop
https://www.w3schools.com/python/python_for_loops.asp
A for loop is used for iterating over a collection. In other languages, this is called a foreach loop.
Syntax:
    for item in list:
        statements
Example:
    fruitList = ["apple", "orange", "banana"]
    for x in fruitList:
        print(x)
Accessing list elements using index
    fruitList = ["apple", "orange", "banana"]
    for i in range(len(fruitList)):
        print(fruitList[i])
Note: range(len(fruitList)) generates the indexes from 0 to len(fruitList) - 1.
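When both the index and the item are needed, the built-in enumerate() function is a common alternative to range(len(...)). The sketch below is an addition to the slides; the labeled list is a hypothetical name used only to collect the results.

```python
fruitList = ["apple", "orange", "banana"]

labeled = []
# enumerate() yields (index, item) pairs, starting at index 0.
for i, fruit in enumerate(fruitList):
    labeled.append(str(i) + ": " + fruit)
    print(i, fruit)

print(labeled)  # ['0: apple', '1: orange', '2: banana']
```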
Example: Compute the sum and average of numbers in a list
    myGPA = [2.5, 3.2, 3.4, 2.9, 3.6]
    sumGPA = 0
    for item in myGPA:
        sumGPA += item
    avgGPA = sumGPA / len(myGPA)
    print("Average GPA is: " + str(avgGPA))
Python list aggregation functions sum(), len(), max() and min()
    myGPA = [2.5, 3.2, 3.4, 2.9, 3.6]
    sumGPA = sum(myGPA)
    avgGPA = sumGPA / len(myGPA)
    best = max(myGPA)
    lowest = min(myGPA)
    print("Average GPA is: " + str(avgGPA))
    print('The best score is {} and the lowest score is {}'.format(best, lowest))
Future Value Table with a list of years
    presentValue = float(input("Enter present value: "))
    rate = float(input("Enter interest rate: "))
    print()
    print("\tFuture Value Table")
    print()
    print("\tPresent value:\t" + "${:,.2f}".format(presentValue))
    print("\tInterest rate:\t" + "{:.2%}".format(rate))
    print()
    print("\tYear \t\t Future Value")
    print("\t__________ \t _______")
    print()
    yearlist = [5, 7, 8, 10, 12, 20]
    for year in yearlist:
        futureValue = presentValue * (1 + rate) ** year
        print("\t" + str(year) + "\t\t " + "${:,.2f}".format(futureValue))
    print()
Note: The years are not separated by a fixed interval.
You may declare an empty list and add members later using the list append() method
    >>> mylist = []
    >>> mylist.append(5)
    >>> mylist.append(10)
    >>> print(mylist)
    [5, 10]
List methods for modifying a list
Example: fruitList = ["apple", "orange", "banana"]
append(item): add an item to the end of the list
    fruitList.append("cherry")      # ['apple', 'orange', 'banana', 'cherry']
insert(index, item): add an item at the index position
    fruitList.insert(2, "mango")    # ['apple', 'orange', 'mango', 'banana', 'cherry']
remove(item): remove an item
    fruitList.remove('cherry')      # ['apple', 'orange', 'mango', 'banana']
pop(index): remove and return the item at the index
    fruitList.pop(1)                # returns 'orange'; the list is now ['apple', 'mango', 'banana']
Note: The remove() method removes only the first occurrence of the element.
    fruitList = ["apple", "orange", "banana", "orange"]
    fruitList.remove("orange")
    print(fruitList)    # ['apple', 'banana', 'orange']
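To drop every occurrence rather than only the first, one common approach is to build a new list with a list comprehension. This is a sketch added for illustration; the name withoutOrange is hypothetical.

```python
fruitList = ["apple", "orange", "banana", "orange"]

# Keep every item that is not "orange"; the original list is left unchanged.
withoutOrange = [fruit for fruit in fruitList if fruit != "orange"]

print(withoutOrange)  # ['apple', 'banana']
print(fruitList)      # ['apple', 'orange', 'banana', 'orange']
```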
Range of Index
    fruitList = ['apple', 'orange', 'mango', 'banana', 'cherry']
We can specify a range of indexes by specifying where to start and where to end the range (the first item is position 0).
    print(fruitList[2:4])    # ['mango', 'banana']; note that the item in position 4 is NOT included
    print(fruitList[:4])     # ['apple', 'orange', 'mango', 'banana']; note that the item in position 4 is NOT included
    print(fruitList[2:])     # ['mango', 'banana', 'cherry']; note that the item in position 2 is included
Range of Negative Indexes
    fruitList = ['apple', 'orange', 'mango', 'banana', 'cherry']
    >>> fruitList[-5]
    'apple'
    >>> fruitList[-1]
    'cherry'
    >>> fruitList[-3:-1]
    ['mango', 'banana']
Sort a list
The sort() method sorts the list in ascending order by default.
Example:
    fruitList = ['apple', 'orange', 'mango', 'banana', 'cherry']
    fruitList.sort()
    print(fruitList)    # ['apple', 'banana', 'cherry', 'mango', 'orange']
Descending order:
    fruitList.sort(reverse=True)    # ['orange', 'mango', 'cherry', 'banana', 'apple']
Note: After sorting, the list's original order is lost permanently.
How to keep a copy of the original list? Use the copy() method: listOld = mylist.copy()
    fruitList = ['apple', 'orange', 'mango', 'banana', 'cherry']
    fruitListOld = fruitList               ### does not work: this is a second name for the same list, not a copy
    fruitListOldCopy = fruitList.copy()    ### makes an independent copy
    fruitList.sort()
    print(fruitList)           # sorted
    print(fruitListOld)        # also sorted, because it is the same list
    print(fruitListOldCopy)    # original order
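The difference between a second name and a real copy can be checked with the is operator. The sketch below also uses the built-in sorted() function, which is not covered in the slides: it returns a new sorted list and leaves the original untouched. The names fruitListAlias and sortedFruits are illustrative.

```python
fruitList = ['apple', 'orange', 'mango', 'banana', 'cherry']

fruitListAlias = fruitList         # second name for the same list object
fruitListCopy = fruitList.copy()   # independent copy
sortedFruits = sorted(fruitList)   # new sorted list; fruitList is not modified

fruitList.sort()                   # sorts in place, so the alias changes too

print(fruitListAlias is fruitList)  # True
print(fruitListCopy is fruitList)   # False
print(fruitListAlias)  # ['apple', 'banana', 'cherry', 'mango', 'orange']
print(fruitListCopy)   # ['apple', 'orange', 'mango', 'banana', 'cherry']
```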
Example: Three exam scores are stored in a list. Compute the weighted average of the three exams using this formula: 60% * highest score + 30% * 2nd highest score + 10% * lowest score
    examScores = [79, 85, 65]
    examScores.sort()
    weightedAvg = 0.6 * examScores[2] + 0.3 * examScores[1] + 0.1 * examScores[0]
    print("The weighted avg is: " + str(weightedAvg))
Using sum, max and min to solve the same problem without sorting
    examScores = [79, 85, 65]
    middle = sum(examScores) - max(examScores) - min(examScores)    # the 2nd highest of three scores
    weightedAvg = 0.6 * max(examScores) + 0.3 * middle + 0.1 * min(examScores)
    print("The weighted avg is: " + str(weightedAvg))
Note: This solution works only when there are exactly 3 exams.
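One way to avoid hard-coding three scores is to pair a list of weights with the scores ranked from highest to lowest. This is only a sketch under the assumption that there is one weight per score; the weights and ranked names, and the use of zip(), are additions to the slides.

```python
examScores = [79, 85, 65]
weights = [0.6, 0.3, 0.1]    # assumed: one weight per score, highest score first

# Rank the scores from highest to lowest without changing examScores.
ranked = sorted(examScores, reverse=True)

# zip() pairs each weight with the score in the same position.
weightedAvg = sum(w * s for w, s in zip(weights, ranked))

print("The weighted avg is: {:.1f}".format(weightedAvg))  # The weighted avg is: 81.2
```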
We can use the keyword in to check whether an item exists in a list
    fruitList = ["apple", "orange", "banana"]
    if "apple" in fruitList:
        print('in')
    else:
        print('not in')
    if "kiwi" in fruitList:
        print('in')
    else:
        print('not in')
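A Python string is an indexed sequence of characters, so its characters can be accessed by index just like the members of a list (unlike a list, a string cannot be changed in place)
Example: The indices of the characters in the string coding are labeled as:
    Index      0  1  2  3  4  5
    Character  c  o  d  i  n  g
Example: David Chao (the space at index 5 counts as a character)
    Index      0  1  2  3  4  5  6  7  8  9
    Character  D  a  v  i  d     C  h  a  o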
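String indexing and slicing can be tried out on the same name. This short sketch also shows that assigning to a character raises a TypeError, since strings are immutable; the variable names are illustrative.

```python
name = "David Chao"

firstChar = name[0]    # 'D'
spaceChar = name[5]    # ' ' (the space is a character too)
lastName = name[6:]    # 'Chao'

# Unlike a list, a string cannot be modified through an index.
try:
    name[0] = "d"
    changed = True
except TypeError:
    changed = False

print(firstChar, lastName, len(name), changed)  # D Chao 10 False
```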
Search and count a string
To determine whether a character or substring is present in a string, use the in keyword.
count(): returns the number of (non-overlapping) times a value appears in the string.
Example of using in and count
    customerName = input('Enter customer name: ').upper()
    searchChar = input('Enter search character: ').upper()
    if searchChar in customerName:
        countChar = customerName.count(searchChar)
        print("Yes, there are {} character {} in the name".format(countChar, searchChar))
    else:
        print('{} character not in the name'.format(searchChar))
The search is not limited to a single character; we can also search for a string
    customerName = input('Enter customer name: ').upper()
    searchStr = input('Enter search string: ').upper()
    if searchStr in customerName:
        countStr = customerName.count(searchStr)
        print("Yes, there are {} {} in the name".format(countStr, searchStr))
    else:
        print('{} string not in the name'.format(searchStr))
Count how many times a word is found in a paragraph using the string's count() function
    paragraph = input('Enter a paragraph: ').upper()
    searchWord = input('Enter a search word: ').upper()
    if searchWord in paragraph:
        countWord = paragraph.count(searchWord)
        print("There are {} {} in the paragraph.".format(countWord, searchWord))
    else:
        print('{} not in the paragraph'.format(searchWord))
Note: count() matches substrings, so a search for TEST also counts the TEST inside TESTING.
Using the split() method for word frequency analysis
split() splits a string into a list where each word is a list item.
    txt = "This is a test, this is only a test."
    txtList = txt.split()
    print(txtList)    # ['This', 'is', 'a', 'test,', 'this', 'is', 'only', 'a', 'test.']
Note: split() separates on whitespace only, so punctuation stays attached: the list contains 'test,' (with a comma) and 'test.' (with a period).
Count the number of words in a string: len(list)
    txt = "This is a test, this is only a test."
    wordList = txt.split()
    print("There are: " + str(len(wordList)) + " words in the text")
Count how many times a word is found in a paragraph using the split() method and a for loop
    paragraph = input('Enter a paragraph: ').upper()
    searchWord = input('Enter a search word: ').upper()
    wordList = paragraph.split()
    wordCount = 0
    for word in wordList:
        if searchWord in word:    ### I use in, not ==, why?
            wordCount += 1
    print("There are {} {} in the paragraph.".format(wordCount, searchWord))
Sample run:
    Enter a paragraph: this is a test, this is only a test
    Enter a search word: test
    There are 2 TEST in the paragraph.
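A full word-frequency count can be sketched by combining split() with a dictionary. The version below strips leading and trailing punctuation with string.punctuation so that 'test,' and 'test.' both count as 'test', which then allows exact matching. The wordCounts name and the choice to lowercase the text are assumptions made for this illustration.

```python
import string

txt = "This is a test, this is only a test."

wordCounts = {}
for word in txt.lower().split():
    # Remove punctuation attached to either end of the word.
    word = word.strip(string.punctuation)
    if word:
        # get() returns 0 the first time a word is seen.
        wordCounts[word] = wordCounts.get(word, 0) + 1

print(wordCounts)  # {'this': 2, 'is': 2, 'a': 2, 'test': 2, 'only': 1}
```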
A list of objects
    import employeeClass    ### employeeClass module defines employee class
    myEmp1 = employeeClass.employee('e1', 'peter', 'm', '07/04/2020', 7000.00)
    myEmp2 = employeeClass.employee('e2', 'paul', 'm', '12/25/2018', 8000.00)
    myEmp3 = employeeClass.employee('e3', 'mary', 'f', '03/08/2018', 7500.00)
    empList = [myEmp1, myEmp2, myEmp3]    ## list of objects
    for emp in empList:
        empOut = "empID:{} Name:{} Sex:{} Salary:{} Hire date:{}"
        print(empOut.format(emp.empID, emp.ename, emp.sex, emp.salary, emp.hireDate))
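The employeeClass module itself is not shown in the slides. The sketch below is a hypothetical stand-in, with the constructor argument order and attribute names (empID, ename, sex, hireDate, salary) inferred from how the examples call and use the class; the real module may differ.

```python
class employee:
    """Hypothetical stand-in for employeeClass.employee (not the original module)."""

    def __init__(self, empID, ename, sex, hireDate, salary):
        self.empID = empID
        self.ename = ename
        self.sex = sex
        self.hireDate = hireDate
        self.salary = salary


empList = [
    employee('e1', 'peter', 'm', '07/04/2020', 7000.00),
    employee('e2', 'paul', 'm', '12/25/2018', 8000.00),
    employee('e3', 'mary', 'f', '03/08/2018', 7500.00),
]

# Collect one attribute from every object in the list.
names = [emp.ename for emp in empList]
print(names)  # ['peter', 'paul', 'mary']
```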
Using index to retrieve list members
Example: empList = [myEmp1, myEmp2, myEmp3]
Each member has an index: 0, 1, 2
len(empList) returns a value of 3
range(len(empList)) generates 0, 1, 2
Using index to retrieve members
    import employeeClass    ### employeeClass module defines employee class
    myEmp1 = employeeClass.employee('e1', 'peter', 'm', '07/04/2020', 7000.00)
    myEmp2 = employeeClass.employee('e2', 'paul', 'm', '12/25/2018', 8000.00)
    myEmp3 = employeeClass.employee('e3', 'mary', 'f', '03/08/2018', 7500.00)
    empList = [myEmp1, myEmp2, myEmp3]
    for i in range(len(empList)):
        empOut = "empID:{} Name:{} Sex:{} Salary:{} Hire date:{}"
        print(empOut.format(empList[i].empID, empList[i].ename, empList[i].sex, \
              empList[i].salary, empList[i].hireDate))    # \ means line continuation
Note: Use the . notation to access a property: empList[i].empID
Compute employee average salary
    import employeeClass    ### employeeClass module defines employee class
    myEmp1 = employeeClass.employee('e1', 'peter', 'm', '07/04/2020', 7000.00)
    myEmp2 = employeeClass.employee('e2', 'paul', 'm', '12/25/2018', 8000.00)
    myEmp3 = employeeClass.employee('e3', 'mary', 'f', '03/08/2018', 7500.00)
    empList = [myEmp1, myEmp2, myEmp3]    ## list of objects
    sumSalary = 0
    for emp in empList:
        sumSalary += emp.salary
    avgSalary = sumSalary / len(empList)
    print('We have {} employees, and the average salary is: ${:,.2f}.'\
          .format(len(empList), avgSalary))
Compute employee average salary using index
    import employeeClass    ### employeeClass module defines employee class
    myEmp1 = employeeClass.employee('e1', 'peter', 'm', '07/04/2020', 7000.00)
    myEmp2 = employeeClass.employee('e2', 'paul', 'm', '12/25/2018', 8000.00)
    myEmp3 = employeeClass.employee('e3', 'mary', 'f', '03/08/2018', 7500.00)
    empList = [myEmp1, myEmp2, myEmp3]    ## list of objects
    sumSalary = 0
    for i in range(len(empList)):
        sumSalary += empList[i].salary
    avgSalary = sumSalary / len(empList)
    print('We have {} employees, and the average salary is: ${:,.2f}.'\
          .format(len(empList), avgSalary))
We may pass a list to a function
    import employeeClass    ### employeeClass module defines employee class
    def getAvgSalary(empList):
        sumSalary = 0
        for emp in empList:
            sumSalary += emp.salary
        return sumSalary / len(empList)
    myEmp1 = employeeClass.employee('e1', 'peter', 'm', '07/04/2020', 7000.00)
    myEmp2 = employeeClass.employee('e2', 'paul', 'm', '12/25/2018', 8000.00)
    myEmp3 = employeeClass.employee('e3', 'mary', 'f', '03/08/2018', 7500.00)
    empList = [myEmp1, myEmp2, myEmp3]    ## list of objects
    avgSalary = getAvgSalary(empList)
    print('We have {} employees, and the average salary is: ${:,.2f}.'\
          .format(len(empList), avgSalary))
Note: The function and the variable that holds its result need different names. Writing avgSalary = avgSalary(empList) would replace the function with a number, so a second call would fail.
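The same calculation can be written more compactly with sum() and a generator expression. In this sketch, types.SimpleNamespace objects stand in for the employee objects so the example runs without the employeeClass module; the function name averageSalary and the decision to return 0.0 for an empty list are assumptions, not part of the slides.

```python
from types import SimpleNamespace

# Stand-ins for employee objects: anything with a .salary attribute works.
empList = [
    SimpleNamespace(ename='peter', salary=7000.00),
    SimpleNamespace(ename='paul', salary=8000.00),
    SimpleNamespace(ename='mary', salary=7500.00),
]


def averageSalary(employees):
    """Return the mean salary, or 0.0 for an empty list (avoids dividing by zero)."""
    if not employees:
        return 0.0
    return sum(emp.salary for emp in employees) / len(employees)


print('The average salary is: ${:,.2f}.'.format(averageSalary(empList)))
# The average salary is: $7,500.00.
```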
Tuple
A tuple is a collection which is ordered and unchangeable. In Python, tuples are written with round brackets.
    thistuple = ("apple", "banana", "cherry")
    print(thistuple)
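What "unchangeable" means in practice can be sketched as follows: reading by index works as with a list, but assigning to an index raises a TypeError. The unpacking line and the one-item tuple are additions to the slides, and the variable names are illustrative.

```python
thistuple = ("apple", "banana", "cherry")

firstItem = thistuple[0]    # reading by index works like a list

# Assigning to an index is not allowed for tuples.
try:
    thistuple[0] = "kiwi"
    changed = True
except TypeError:
    changed = False

# A tuple can be unpacked into separate variables.
first, second, third = thistuple

# A one-item tuple needs a trailing comma; ("apple") alone is just a string.
single = ("apple",)

print(firstItem, changed, third, len(single))  # apple False cherry 1
```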
Dictionary
A dictionary is a collection of key:value pairs which is changeable and indexed by key. Since Python 3.7 it keeps its items in insertion order; earlier versions treated it as unordered. In Python, dictionaries are written with curly brackets, and they have keys and values.
Example:
    thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}
Accessing Items
We can access the items of a dictionary by referring to its key name, inside square brackets.
Example:
    >>> thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}
    >>> x = thisdict["model"]
    >>> print(x)
    Mustang
Loop Through a Dictionary
To retrieve all the keys:
    for x in thisdict:
        print(x)
To retrieve all the values:
    for x in thisdict:
        print(thisdict[x])
Or:
    for x in thisdict.values():
        print(x)
To retrieve all key:value pairs:
    for x, y in thisdict.items():
        print(x, y)
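Because a dictionary is changeable, items can be updated, added, and removed through their keys. The sketch below is an addition to the slides; the "color" and "owner" keys are made up for illustration.

```python
thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}

thisdict["year"] = 2020      # existing key: the value is replaced
thisdict["color"] = "red"    # new key: a new pair is added
del thisdict["model"]        # remove a pair by key

# get() returns a default instead of raising KeyError for a missing key.
owner = thisdict.get("owner", "unknown")

print(thisdict)  # {'brand': 'Ford', 'year': 2020, 'color': 'red'}
print(owner)     # unknown
```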
The built-in attribute __dict__ presents all instance attributes of an object as a dictionary
Note: Two underscores on each side
    import classModule
    myEmp = classModule.employee('e1', 'peter', 'm', '07/04/2020', 7000.00)
    print(myEmp.ename)
    myEmpDict = myEmp.__dict__
    print(myEmpDict["ename"])
JSON Object
JavaScript Object Notation
https://www.w3schools.com/js/js_json_intro.asp
JSON object: A JSON object contains a set of key-value pairs separated by commas and enclosed within { and } characters. Keys must be strings, written with double quotes:
    {"k1": "value", "k2": 10}
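A JSON object maps naturally onto a Python dictionary. The slides do not cover this step, but a minimal sketch using the standard library's json module looks like this; the variable names are illustrative.

```python
import json

jsonText = '{"k1": "value", "k2": 10}'

# json.loads() parses JSON text into Python objects (a JSON object becomes a dict).
data = json.loads(jsonText)
print(data["k1"])    # value
print(data["k2"])    # 10

# json.dumps() converts a Python dictionary back into JSON text.
backToText = json.dumps(data)
print(backToText)    # {"k1": "value", "k2": 10}
```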