
COMP 202: Principles of Object-Oriented Programming II

  Marine Biology Simulation: Comparison  

Many of you may be familiar with the College Board's AP Marine Biology Case Study, which has been part of the AP Computer Science curriculum since the 1994-95 school year. During the summer of 2003, it was converted from C++ to Java. Since we are always looking for good examples and homework problems for you, we decided to take a look at it.

The AP Marine Biology Simulation Case Study - Teacher’s Manual makes the following claim: “Through the AP Marine Biology Simulation Case Study, the strategies, vocabulary, and techniques of object-oriented design will be emphasized.” Unfortunately, our analysis showed that the current implementation of the case study was severely deficient in several respects and not suitable for use in our curriculum. To a certain extent that is understandable: The AP MBS was designed for high school students, not for college students in an object-oriented curriculum.

We nonetheless liked the idea of the MBS case study and therefore decided to redesign it for our purposes. We made many changes and now feel that it truly emphasizes object-oriented design. We realize our formulation only works in courses like Comp 202, not in the current AP curriculum for high schools. We still claim that many of the improvements could be taught even in high school.

In this lab, we will compare the two versions of the case study. We will first look at the AP MBS and try to identify a weakness. Then, we will think about how to improve it, and finally see what decisions we made for the Rice MBS.


Part 1: Comparison of the AP MBS and the Rice MBS

1.1: Tight coupling between fish and environments

One of the first things we tried was to find a way to break the AP MBS. Were there any weaknesses that allowed us to get unexpected behavior, perhaps even a crash? Out of all the properties a program should have, correctness is the most important one.

Indeed, we found an easy way to crash the program. We just had to write a fish that ignored a small comment in the original AP MBS code: "Precondition: obj.location() is a valid location and there is no other object there".

This fish jumps around randomly. It compiles, it uses the provided library... It just doesn't make sure that the place it jumps to is empty. After a while, the program will throw an exception and terminate.

Why should the fish have to check that the destination is empty? Why does the environment let the fish do something that will definitely crash the program? Something is wrong here. And there is more: The location of the fish is stored in both the fish and the environment. The fish works directly with the environment's internals.

What is bad about these things?

First of all, the environment cannot protect itself from badly or maliciously written fish. The environment should only allow a fish to do things that are safe. Second, a programming error might let us desynchronize the locations stored in the fish and the environment. It is a bad idea to have several copies of the same data; if we change the data, we have to remember to change it in all places. And third, what if we wanted to change the environment? If fish and environment are so tightly coupled, we have to change all the fish.

All of these things are essentially coupling issues: The environment shouldn't have to know anything about the fish, and the fish shouldn't have to know anything about the environment. If we decouple them properly, all these problems should go away.

We definitely wanted to prevent the fish from doing bad things. At the same time, a fish should be able to do all the things it could do in the AP MBS (well, all the safe things; crashing the program == bad). We wanted to change the environment without affecting fish. And we also wanted to store the location in only one place... And if you don't really know anything about the environment, what exactly is a location anyway?

We demonstrated our solution in a presentation at SIGCSE 2004. Here is a different version of the presentation that does not give away the answers to your project #3. Sorry :-)

By using a local environment as a layer of abstraction between the global environment and the fish, the fish does not have to know anything about the environment to use it. The concept of a location is entirely abstract and depends on the global environment used. Regardless of what environment a developer had in mind when he or she wrote a fish, the fish will work as well as it can in any environment.

A fish doesn't tell the environment to change its location. It asks if it can change it, and only then does the environment get the ability to actually do so. This prevents the fish from doing anything the environment does not want. We can easily show all of this in a demo.
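The "ask first, then receive the ability to act" idea can be sketched in a few lines. This is only an illustration under our own assumptions, not the Rice MBS code (and not the answer to project #3): the names `LocalEnv`, `MoveCommand`, `LineEnv`, and `tryMoveForward` are hypothetical, and the world is reduced to a one-dimensional line of cells.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

/** Hypothetical capability: permission to perform one move the environment approved. */
interface MoveCommand {
    void execute();
}

/** Hypothetical view of the world as seen by a single fish; no coordinates leak out. */
interface LocalEnv {
    /** Returns a command only if the cell ahead exists and is free. */
    Optional<MoveCommand> tryMoveForward();
}

/** Toy global environment: a line of cells. Locations are stored here and nowhere else. */
class LineEnv {
    private final Set<Integer> occupied = new HashSet<>();
    private final int size;

    LineEnv(int size) {
        this.size = size;
    }

    boolean isOccupied(int cell) {
        return occupied.contains(cell);
    }

    /** Places a fish and hands back its local view; invalid or occupied cells are rejected. */
    LocalEnv addFish(int start) {
        if (start < 0 || start >= size || !occupied.add(start)) {
            throw new IllegalArgumentException("cell not available: " + start);
        }
        return new LocalEnv() {
            private int cell = start; // the only copy of this fish's location

            @Override
            public Optional<MoveCommand> tryMoveForward() {
                int next = cell + 1;
                if (next >= size || occupied.contains(next)) {
                    return Optional.empty(); // the fish never gets the ability to move
                }
                return Optional.of(() -> {
                    // Re-check when executed, so a stale command cannot corrupt the state.
                    if (cell + 1 != next || occupied.contains(next)) {
                        return;
                    }
                    occupied.remove(cell);
                    occupied.add(next);
                    cell = next;
                });
            }
        };
    }
}
```

A fish written against `LocalEnv` would call `tryMoveForward()` and execute the returned command only if one is present. It has no way to name a destination cell, so it cannot jump onto another fish.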

By finding the proper abstractions and decoupling the environment from the fish, we were able to create a correct solution that is also more flexible and extensible than the original program.

1.2: Integers used for Run and Seed Modes

The MBS allows you to choose different modes of running the simulation and setting the seed for the random number generator. There are three modes for running:

Similarly, there are three ways to seed the random number generator. Computers can't actually produce random numbers ("Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." - John von Neumann, 1951). Random number generators (or more precisely, pseudo-random number generators) thus are complicated algorithms based on number-theoretic properties that take some inputs and then generate a sequence of numbers that looks random, at least to a certain degree. If you give it the same input, called the seed, you will always get the same sequence of numbers. In experiments, this is actually useful, since it allows you to repeat a "random" experiment.
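The repeatability of a seeded generator is easy to check with the standard `java.util.Random` class. The following is a small sketch; the class name `SeedDemo` and the method `draw` are made up for this illustration.

```java
import java.util.Random;

/** Hypothetical helper that draws a short "random" sequence from a given seed. */
class SeedDemo {
    static int[] draw(long seed, int count) {
        Random rng = new Random(seed); // same seed, same sequence
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = rng.nextInt(100); // value in the range 0..99
        }
        return values;
    }
}
```

Calling `SeedDemo.draw(42L, 5)` twice yields two arrays with identical contents, which is exactly what lets us rerun a "random" experiment.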

The MBS lets you specify the seed in three different ways:

Let's take a look at how the AP MBS implemented this kind of thing by examining code for the "run mode":

This source code sets an integer flag whenever the ActionListener is invoked. The flag is immediately checked to find out if we need to ask for the number of steps now. When the start button is pressed, that flag is checked again to see if we have to ask for the number then. And after every step, we compare the number of steps we have taken to the maximum number of steps we should take. "0" as maximum number of steps is a special code for "run indefinitely".

What is problematic about this way of doing things?

There are four places where things need to happen: one place where something is actually changed, and three additional places that are affected. Right now, there are only three modes, but if we were to add more, we would have to make changes in all four of these places. If we forget a change in one of the places, we have introduced a bug. If we generalize this to n options and m places where a change needs to be made, we can quickly see that this approach does not scale.

Let's think about how we can improve this.

We can't really do much about the code that reacts to the user input. However, in all the other cases, we can make one generalization: We know something has to happen (this "something" might be nothing)! We also know that this "something" is determined at the time the users make their choices. What we are doing here is separating variants (what exactly needs to happen) from invariants (something has to happen). Is there a way we can somehow put all the variants in the code reacting to user input?

Sure! This is what the command pattern is for. A command can be used for "delayed execution": We already know what we want to do, we just don't want to do it yet. To implement this pattern, we need an interface (here ILambda) with a single method (apply). Whenever the user selects a menu option, we make all the right ILambdas and store them somewhere for later. When the time has come to do something, we call the apply method of the stored command. We no longer have to know what it does; it will just do the right thing.
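To make the idea concrete before looking at the real code, here is a minimal sketch of the pattern applied to run modes. It is not taken from the Rice MBS: the no-argument signature of `apply` is an assumption, and `SimDriver` with its methods is a hypothetical stand-in for the view and simulation engine.

```java
/** Command interface named as in the text; the no-argument signature is an assumption. */
interface ILambda {
    void apply();
}

/** Hypothetical driver: start() and step() never inspect which run mode is active. */
class SimDriver {
    private int stepsTaken = 0;
    private int remaining = 0;
    private boolean running = false;

    // Stored commands; the defaults do nothing ("something" may be nothing).
    private ILambda onStart = () -> { };
    private ILambda afterStep = () -> { };

    // The variants live here, in the code that reacts to the user's menu choice.
    void selectRunIndefinitely() {
        onStart = () -> { };
        afterStep = () -> { };
    }

    void selectRunForSteps(int maxSteps) {
        onStart = () -> remaining = maxSteps;
        afterStep = () -> {
            if (--remaining <= 0) {
                running = false;
            }
        };
    }

    // The invariants: something has to happen on start and after every step.
    void start() {
        running = true;
        onStart.apply();
    }

    void step() {
        if (!running) {
            return;
        }
        stepsTaken++;
        afterStep.apply();
    }

    boolean isRunning() {
        return running;
    }

    int getStepsTaken() {
        return stepsTaken;
    }
}
```

Adding a new mode in this sketch means adding one more `select...` method. `start()` and `step()` stay untouched because they only call `apply` on whatever command was stored.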

Here are snippets from the Rice MBS:

Maybe it is a little more code right now, but the code is easier to understand and much easier to extend. All the tough decisions are made in one place in MBSView.java. In all the places affected, we just call an abstract method and let polymorphism take care of the rest. If we were to add another mode, we'd have to change only MBSView.java.

1.3: Hard-coded environment creation

The AP MBS provided two different kinds of environment: A rectangular, grid-based, bounded environment requiring two parameters (width and height), and a grid-based unbounded environment that does not require any parameters (it's infinite along both axes). The number of parameters to an environment has been hard-coded to either zero or two.

What if we wanted to add another kind of environment that took a different number of parameters? What if we wanted to allow the user to load new kinds of environments at runtime, without having any idea what they need? Clearly, the previous approach is a dead end.

Let's take a step back and think abstractly about this problem. When the user pushes the "Create" button in the "Create new environment" dialog box, we know one thing: Regardless of what they have selected and what data they have entered, they want to create a new environment. We don't know what specific thing we have to create, but we do know that we have to create an environment right now. In the past we have used the abstract factory pattern for this kind of abstract creation. In the Koch curve project, the user selected a factory, and whenever the "Reset" or "Grow" buttons were pressed, the right kind of curve was created. We can do the same thing here. We have a factory for environments, and whenever the "Create" button is pressed, we ask that factory to make a new environment.

There is one problem though: In the Koch curve project, all the factories took the same parameters: All they needed to know was what the previous factory was. Here, we have no idea what a factory needs to know! The user enters data into fields in the dialog, and that data has to go to the factory.

Oh, by the way, how does the program even know what fields to display in the dialog? What determines what fields need to be shown? Depending on what environment has been selected, different fields are shown, so the appearance of the dialog is controlled by the factory. In a way, you can say that the environment creates the dialog. We don't know what environment is selected, we don't know what to display, all we know is that we need this dialog right now so we can display it. Does this situation sound familiar?

It is another case of abstract creation: The environment classes act as abstract factories for the dialogs, and the dialogs act as factories for environments. Pretty twisted, hmm? Let's look at some code from the Rice MBS:

At the beginning of the program, we simply put all the names of the environments in a combobox. Whenever the user selects one, we load the corresponding class at runtime (we can do that using Java's "reflection" capabilities). Now we have an AGlobalEnv, but we don't know much about it. We don't have to know very much, though, because we definitely know there is a makeEnvFactory method.

We call this method and get an instance of AEnvFactory back. This class acts as a JPanel and gets displayed in the dialog, but it also is the factory to create the desired environment. Inside the dialog, we add the instance and display the settings. Whenever the user is done and pushes the "Create" button, we pass the AEnvFactory instance to the model. It can call the AEnvFactory.create method to get exactly the environment the user desired without having to know what kind of environment it is, how many parameters were needed, or what values the user picked.
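The flow just described can be condensed into a small sketch. Only the names AGlobalEnv, AEnvFactory, makeEnvFactory, and create come from the text above; all signatures, the concrete classes `BoundedEnv` and `UnboundedEnv`, the `describe` method, and the `EnvLoader` helper are assumptions made for this illustration and will differ from the actual Rice MBS source.

```java
import javax.swing.JPanel;
import javax.swing.JTextField;

/** An environment that can produce the settings panel for its own kind (assumed signature). */
abstract class AGlobalEnv {
    abstract AEnvFactory makeEnvFactory();
    abstract String describe(); // hypothetical, only here to make the result visible
}

/** A settings panel that doubles as the factory for the configured environment. */
abstract class AEnvFactory extends JPanel {
    abstract AGlobalEnv create();
}

/** Hypothetical bounded grid that needs two parameters. */
class BoundedEnv extends AGlobalEnv {
    private final int width;
    private final int height;

    BoundedEnv() { this(10, 10); } // no-argument constructor used by reflective loading

    BoundedEnv(int width, int height) {
        this.width = width;
        this.height = height;
    }

    @Override
    AEnvFactory makeEnvFactory() {
        return new AEnvFactory() {
            private final JTextField widthField = new JTextField("10", 5);
            private final JTextField heightField = new JTextField("10", 5);
            {
                add(widthField);  // the panel decides which fields the dialog shows
                add(heightField);
            }

            @Override
            AGlobalEnv create() {
                int w = Integer.parseInt(widthField.getText().trim());
                int h = Integer.parseInt(heightField.getText().trim());
                return new BoundedEnv(w, h);
            }
        };
    }

    @Override
    String describe() { return "bounded " + width + "x" + height; }
}

/** Hypothetical unbounded grid that needs no parameters, so its panel is empty. */
class UnboundedEnv extends AGlobalEnv {
    @Override
    AEnvFactory makeEnvFactory() {
        return new AEnvFactory() {
            @Override
            AGlobalEnv create() { return new UnboundedEnv(); }
        };
    }

    @Override
    String describe() { return "unbounded"; }
}

/** Loads an environment class by name at runtime and asks it for its factory panel. */
class EnvLoader {
    static AEnvFactory factoryFor(String className) {
        try {
            Object prototype = Class.forName(className).getDeclaredConstructor().newInstance();
            return ((AGlobalEnv) prototype).makeEnvFactory();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load environment " + className, e);
        }
    }
}
```

The dialog would add the panel returned by `EnvLoader.factoryFor(...)` to itself and, when "Create" is pushed, hand it to the model, which simply calls `create()`. Neither the dialog nor the model ever mentions widths, heights, or any other parameter.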

With this setup, we can add an arbitrary number of environments at runtime, and those environments can take a completely arbitrary number of parameters of arbitrary types. If we want to have an environment where we can select the color of the water, we can do it (see NoGridEnv), and all because of abstract creation: abstract factories creating abstract factories creating environments.

Again, all we did was separate variants (what do we need to display in the dialog? / what parameters does an environment take?) from invariants (we need to display something! / we need to create an environment!). The result was an abstraction that makes the program much easier to extend.

1.4: Tight coupling between model and view

In the AP MBS, view and model were tightly coupled. There were no adapters between the view and the model, and both called methods in the other part of the program directly. One of the first things we did when we re-engineered the MBS was to introduce an MVC pattern, more out of habit than out of real need.

Three days before the deadline for the paper in which we described the Rice MBS, we completely changed almost the entire model. After we were done writing the new model, it was late at night and we estimated we would have several hours of work to do to wire the new model to the old view. It would be tedious, but at least it would be easy work.

Unexpectedly, we were done in several minutes. All we needed to do was change the adapters; nothing had to be changed in the view. The MVC pattern paid off and bought us some additional hours of sleep!
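Why the rewiring was so quick can be seen in a toy version of the setup. Everything in this sketch is hypothetical (the adapter interfaces, `SimView`, the two models, and `Wiring`); it only illustrates that a view talking to adapters is unaffected when the model behind them is replaced.

```java
/** What the view needs from "some model" (hypothetical adapter interface). */
interface IStepAdapter {
    void requestStep();
}

/** What a model needs from "some view" (hypothetical adapter interface). */
interface IDisplayAdapter {
    void showStatus(String text);
}

/** The view knows only its adapter, never a concrete model class. */
class SimView {
    private IStepAdapter model = () -> { };
    private String lastStatus = "";

    void setModelAdapter(IStepAdapter adapter) { model = adapter; }
    void clickStep() { model.requestStep(); } // stands in for a button press
    void display(String text) { lastStatus = text; }
    String getLastStatus() { return lastStatus; }
}

/** Original model with a step() method. */
class OldModel {
    private final IDisplayAdapter view;
    private int steps = 0;

    OldModel(IDisplayAdapter view) { this.view = view; }

    void step() {
        steps++;
        view.showStatus("step " + steps);
    }
}

/** Rewritten model with a different API. */
class NewModel {
    private final IDisplayAdapter view;
    private int ticks = 0;

    NewModel(IDisplayAdapter view) { this.view = view; }

    void advance(int count) {
        ticks += count;
        view.showStatus("tick " + ticks);
    }
}

/** The controller code is the only place that changes when the model is swapped. */
class Wiring {
    static SimView withOldModel() {
        SimView view = new SimView();
        OldModel model = new OldModel(view::display);
        view.setModelAdapter(model::step);
        return view;
    }

    static SimView withNewModel() {
        SimView view = new SimView();
        NewModel model = new NewModel(view::display);
        view.setModelAdapter(() -> model.advance(1)); // only the adapter body differs
        return view;
    }
}
```

`SimView` is identical in both wirings. Moving from `OldModel` to `NewModel` touches only the adapter passed to `setModelAdapter`.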


URL: http://www.clear.rice.edu/comp202/08-fall/lectures/ricembs2/index.shtml
Copyright © 2008-2010 Mathias Ricken and Stephen Wong