operating systems
Background

The scientific method is a method of acquiring knowledge that has been the basis of science for hundreds of years. The method consists of a set of steps:

1. Make an observation
2. Ask a question
3. Form a hypothesis
4. Test the prediction
5. Draw conclusions
6. Publish results

In the labs in this course, you will use the scientific method to understand how different parts of the operating system function. Most of our labs in this course will work like this:

- You will be given the observation, the question, or a hypothesis.
- You will be asked to create and/or run an experiment.
- You will need to collect data and draw conclusions about the data.
- You will need to write the results in the form of a lab report.

The communication of the results in the form of the lab report is the most important part. Do not assume that simply running a program will earn a good grade on the labs. The background research and the analysis of the results are what the instructor will look at most closely. Even if your program has a fatal flaw or the experiment fails to capture the expected results, that is okay, but you must report what was found and identify the source of your problems.

INFT2201 Comparing Search Times for Lists and Sets in Python

The following post was found on Stack Overflow about the speed of using a list in a Python program compared to a set:

"Sets are significantly faster when it comes to determining if an object is present in the set (as in x in s) but are slower than lists when it comes to iterating over their contents."

Further, in this lab we would also like to test the hypothesis:

The amount of time required to determine if an object is in a list depends on the number of objects in the list; however, the time required to determine if an object is in a set does not depend on the number of objects in the set.

In this lab exercise you will need to conduct an experiment to support or contradict this statement.
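As a rough illustration of the kind of measurement the hypothesis calls for, the sketch below times repeated `in` checks against a list and a set holding the same values. It is not one of the provided programs; the function names `time_membership` and `run_trial`, the default of 1000 lookups, and the choice of probe values are all assumptions made for illustration.

```python
import random
import time


def time_membership(container, probes):
    """Return (elapsed seconds, number of hits) for testing each probe with `in`."""
    start = time.perf_counter()
    hits = 0
    for value in probes:
        if value in container:
            hits += 1
    return time.perf_counter() - start, hits


def run_trial(size, lookups=1000, seed=0):
    """Time the same lookups against a list and a set of `size` elements."""
    rng = random.Random(seed)
    # Identical contents in both structures keep the comparison fair.
    as_list = list(range(size))
    as_set = set(as_list)
    # Probes come from twice the stored range, so roughly half are absent.
    probes = [rng.randrange(2 * size) for _ in range(lookups)]
    list_seconds, list_hits = time_membership(as_list, probes)
    set_seconds, set_hits = time_membership(as_set, probes)
    return {
        "size": size,
        "list_seconds": list_seconds,
        "set_seconds": set_seconds,
        "list_hits": list_hits,
        "set_hits": set_hits,
    }


if __name__ == "__main__":
    # Hypothetical size; choosing suitable sizes is part of the experiment.
    print(run_trial(10_000))
```

Both structures are probed with the same values, so the hit counts should match and any difference in elapsed time comes from the data structure rather than from the inputs.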
The Programs

Two Python programs have been provided for you that generate a large list/set and then repeatedly check whether random values are found in the list/set. A timing calculation has already been embedded in each program. You may use these programs and make modifications as required to perform the experiment. The modification will involve just changing the number of items in the data structures.

The Experiment (Procedure)

1. Make sure that the number of elements in the list and in the set is the same.
2. Run each program several times with a fixed number of elements in the list/set and collect the results in an Excel spreadsheet.
3. Run each program several times with different numbers of items in the list/set and collect the data in the Excel spreadsheet (on a different sheet).

The spreadsheet data you collected is called the "raw data". It does not need to be included in the lab report, but you must put the data into a spreadsheet and submit the spreadsheet as a separate file. Spreadsheets are good for this because you can do things like statistical calculations, creating new columns and generating graphs.

The Analysis

Normally there will be no instructions on how to perform the analysis. For this lab we will provide some suggestions to give you an idea of how to proceed. Expect less of this in future labs.

1. Compare the values from the various runs on each type of storage mechanism. Are they the same, or are they different? Can you explain why you are seeing the same numbers or different numbers? Is there something you can do to make the timings more "compact", such as taking an average of the times? Remember that we are trying to consolidate all the data we collected into something more meaningful. Be careful about averaging too many things! Calculating the average time for all the searches across lists of different sizes does give you a number, but that number might not be helpful in this lab.
2. Compare the times required to find values in the list to the times required to find the values in the set when both structures are the same size. If the set is faster as expected, explain why, referring to the information that you put into the theory section. If the results show no difference, you will need to explain why that is and suggest how you would determine what went wrong (maybe the statement is incorrect, maybe your experiment is invalid).
3. Compare the times required to find elements in the list/set when there are X elements, where X is a variable that is changed several times. Choosing the number of attempts and the sizes is part of the experiment and will not be given to you. You will need to try several values. If you pick numbers that are too close to each other (say 1, 2, 3, 4), you will not get enough variation to draw conclusions. Picking numbers that are really spread out (say 1, 100000, 1000000000) will result in very long-running programs that take too long to measure.
4. Since the timing depends on the size of the structure, it may make sense to create a graph with 2 lines, with the number of elements along the x-axis and the amount of time along the y-axis.

Lab Report

Normally we will not tell you exactly what goes into the lab report. Being able to take the experiment, draw conclusions and explain what is happening forms most of your grade for the labs. We will mention a couple of suggestions here, but please refer to the lab report format document provided. You must use the format described in that document.

We recommend starting by listing all the sections and completing them in any order that you want, but we suggest that the abstract and conclusion sections be left for the very end. The experimental design and procedure section, the analysis section and the theoretical background section are usually the easiest to write, so these are good to start with.
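For the averaging suggested in the analysis, one possible approach is to average the repeated runs separately for each structure and each size, rather than across sizes. The sketch below assumes the raw data is available as rows of (structure, size, seconds); that row layout and the function name `summarize` are hypothetical, and a spreadsheet can do the same calculation.

```python
import statistics
from collections import defaultdict


def summarize(raw_rows):
    """Average repeated timings separately for each (structure, size) pair.

    raw_rows is an iterable of (structure, size, seconds) tuples, for example
    one tuple per run of a program. Returns a dict keyed by (structure, size).
    """
    groups = defaultdict(list)
    for structure, size, seconds in raw_rows:
        groups[(structure, size)].append(seconds)

    summary = {}
    for key, times in groups.items():
        summary[key] = {
            "runs": len(times),
            "mean": statistics.mean(times),
            # The sample standard deviation needs at least two runs.
            "stdev": statistics.stdev(times) if len(times) > 1 else 0.0,
        }
    return summary
```

Keeping the size in the grouping key avoids the problem noted in the analysis of blending timings from differently sized structures into one average, and the per-size means are the values that would be plotted against the number of elements.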
Procedure Section

Remember that the equipment you are running the program on might influence the result. Be sure to mention the parts of the hardware configuration that might have an impact on your results, and say why you think these items have an impact.

Experimental Procedure

Describe how you ran the program, how you got the measurements and how you collected the data. Include a copy of the program that you used, but you do not need to show the program each time that you changed it (e.g. 1000 elements then 2000 elements) unless it was completely modified to compare two approaches. Putting line numbers on the image of the program can help you make statements like "I changed the value of items on line 56 to 4000 and ran the experiment again". The output of one run is helpful, but there is no reason to show the output from every run of the program.

Theoretical Background

Since we are talking about lists and sets in Python, it would be good to explain to the reader how these two structures differ. Since it is possible that you do not know anything about sets, you will need to perform some background research on what they are and then explain to the reader how they work and how they compare to a list. Remember that the purpose of the lab has to do with performance, so make sure to describe why one might be faster than the other. A statement that "sets are faster than lists" is not sufficient; you need to convince the person reading your report that you understand how finding objects in these two data structures works.
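One way to see how the two lookups differ, without relying on timing at all, is to count the work Python does during an `in` test. The sketch below is an illustration under stated assumptions, not part of the provided programs: the wrapper class `Counted` and the helper `count_membership_work` are hypothetical names, and the counts describe CPython's behaviour for this particular setup.

```python
class Counted:
    """Wraps an integer and counts equality comparisons and hash computations."""

    eq_calls = 0
    hash_calls = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.eq_calls += 1
        return isinstance(other, Counted) and self.value == other.value

    def __hash__(self):
        Counted.hash_calls += 1
        return hash(self.value)


def count_membership_work(container, target):
    """Return (found, equality comparisons, hash computations) for one `in` test."""
    Counted.eq_calls = 0
    Counted.hash_calls = 0
    found = target in container
    return found, Counted.eq_calls, Counted.hash_calls


if __name__ == "__main__":
    n = 1000  # hypothetical size chosen for illustration
    as_list = [Counted(i) for i in range(n)]
    as_set = set(as_list)
    # Searching for a value that is not stored is the worst case for the list.
    print("list:", count_membership_work(as_list, Counted(n)))
    print("set: ", count_membership_work(as_set, Counted(n)))
```

The list has no choice but to compare the target with each element in turn, so an unsuccessful search performs one comparison per element. The set hashes the target once to pick a slot in its table and only compares against entries whose stored hash matches, which is why its work does not grow with the number of elements in this sketch.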