CP164: Winter 2024 Midterm Notes

The following are comments on the midterm. Please review this material before bringing any questions about your midterm to the instructor. You should also try completing your solutions in Eclipse to see how well they works.

Benefit of the Doubt (BD) Marks: A BD mark was added to your midterm grade. As always with BD marks, should you request a remark of any midterm question, the BD mark will first be removed from your recorded midterm grade.

The array-based data structures are the easy part of the course. If you can't handle these, you're not going to do better in the last half of the course. At the very least, talk to us about your progress.

Kindly do not ask us "for just a few marks so I can pass the midterm". You don't have to pass the midterm to pass the course. We don't hand out free marks just to make a certain grade level. If you did poorly on the midterm, you have an opportunity to improve for the exam, assuming you stay in the course. You must, however, pass the final exam to pass the course. No Exceptions.

If you passed the midterm remember that a bare pass is not going to get you into 2nd year. You need to demonstrate mastery of this material, and 55% isn't really doing that.

We do not provide the complete code for the answers. It would be a very good idea for you to go through the midterm on your own and attempt to complete and test it thoroughly as good practice.

If your code was not graded due to syntax errors, we will correct and regrade your code with a deduction. You should not have any syntax errors - the testing code provided to you will show those immediately. Correct them before moving on.

If you are still struggling with project importing and exporting, go to a lab and get help. The project given you had the correct name and files. If your project was mis-named, you did something wrong during the import process. If you got muliple projects in one project, you imported it wrong. If the validation gave you errors, you did something wrong during the export process. You have had a great deal of practice with exporting by now - there should be no reason to get this wrong. Attention to detail is key. Every term some students hand in the wrong project or the original empty midterm. This is a Bad Idea.


General Comments

Remember basic CP104/CP164 rules:
  • get basic syntax correct - Eclipse often warns you when there is a problem, pay attention! Running the testing provided usually demonstrates these errors immediately.
  • only one return statement per function/method
  • do not use break
  • do not use continue or pass
  • do not use the Python list remove method
  • do not name your variables the same as the function/method name
  • use appropriate loops - not everything is amenable to being solved by a for loop
  • when working in a class do not call high-level interface methods from within another high-level interface method
Read and apply method and function docstrings carefully:
Get the number and types of parameters correct, and the number and types of returned values correct.

The emphasis on this midterm was the basic concepts of linear data structures and good data structure design. None of these concepts were radically different than those already seen, they were just in a slightly different context.

Use the testing given to you. It is better to hand in a smaller amount of working code than a large amount of non-working code. We explicitly gave you testing code to save you time and effort. Make use of it.

'Open book' does not mean 'easy', it simply gives you access to all of the material you have been working with during the term. The amount of time required to look through existing code, or Googling it on Stack Overflow, or generating it with ChatGPT is costly. Having practiced writing this code so that it immediately comes to mind is a much better, less time-wasting approach.

If you struggled with the CP104 aspect of the midterm - for example, basic programming structures, variable usage, fixing syntax errors - unless you improve your understanding and practice with this material immediately, you're not going to do well in the rest of the course. CP164 assumes you have mastered the CP104 material.

Pay very close attention to the Requirements. You are explicitly warned that if you do not meet the requirements for a task, you may be given a zero for that task. For example, you must clearly demonstrate that you understand the difference between use and extend.


Comments on Set_array and Sorted_Set_array

The array-based Sets are very similar to the other array-based data structures you have already used
Except that values may appear only once. This is significant in add and append - which you were given - and the intersection and union methods
If you are using 'alternative' sources because the midterm is open book, pay close attention to the actual data structure attributes and don't copy things blindly:
_values is the name of the underlying Python list. Any other name causes errors in your testing, so pay attention to the test results.
Pay close attention to the method docstrings
Methods that return another Set must return that, not a Python list, or a Python set().
Make use of the testing you were given

We gave you testing so that you didn't have to write your own. If in running this testing your code never got past the add method, this is a bad sign. Pay attention to the errors and fix them, particularly for the smaller, basic methods.

Pay attention to what the testing is telling you. Just getting a result is not sufficient, it needs to be a correct result, and you need to figure out that that is. The concept of a set is fairly simple and the testing samples illustrated that. Also understand when you have written an infinite loop. The testing seemingly hangs, but Eclipse is telling you that your code is still running.

Using High-Level Public Interface Methods

Which we will refer to as interface methods from this point on.

We have endlessly emphasized that in class implementations you should not use interface methods within other interface methods. It has been discussed in lecture notes, emails, and appears in lab and assignment gradings. The main reason to do so is to avoid having methods do more work than is necessary. Here is a textbook example of how to misuse interface methods. The code is from the intersection method:

      
if new_set.find(self._values[i]) is not None:
    new_set.add(self._values[i])

    

(or it calls append depending on the version of the Set.)

What happens in this algorithm?

  1. The find method is an interface method that calls either _linear_search or _binary_search to determine if a value is already in the new set. These search methods walk through the set looking for a value that matches the parameter.
  2. Assume that find doesn't find anything, the add (or append) method then calls either _linear_search or _binary_search to determine if a value is already in the new set, and if it is not, adds that value to the set.

In other words, if the value is not in the new set, _linear_search or _binary_search is called twice with the exactly the same parameter to return exactly the same result both times.

This is incredibly pointless. It is too much work, and as the set grows in size, the amount of work increases linearly. You need to demonstrate to us that you understand what is going on under the hood with these methods and that you can design an algorithm that minimizes the amount of work done. You accomplish this by working with the underlying data structures (in this case _values) directly and only doing as much work as is minimally necessary.

Note that this is not limited to using interface methods. Here is a poorly written linear search algorithm:

      
if key in self._values:
    while i < len(self._values) and self._values[i] != key:
        i += 1
    

The problem here is very similar to the previous problem. The line
if key in self._values:
uses the in operator to walk through the contents of self._values. (Look at your own definition of the linked __contains__ to see what is going on under the hood.) If the in finds key in the list, it then walks through the list again, looking for the same key value. The only difference is that it now tells you where the key is, instead of simply whether or not the key is there. Again, pointless. There is no need whatsoever to use the in operator, whether it is an interface method or not.