Comp310 Lec19

COMP 310 Spring 2017	Lec19: Visitors...

Let's continue our discussion of visitors from last lecture.

The Visitor Design Pattern is used for type-dependent processing.

Visitors decouple algorithms from data structures.

Important points to remember:

The hosts for a visitor are different types because each host is a different class.
- Each host is a particular concrete implementation of an abstract data structure.
A visitor contains the different cases that an algorithm needs to handle different types.
A visitor is an encapsulation of the entire algorithm that encompasses the processing of all possible types.
Different visitors represent different algorithms.
Visitors are physically separate from their hosts.
- That is, the algorithms are decoupled from their data structures.
- This decoupling enables the behavior of the data structure (i.e. the algorithms in which the hosts in that data structure participate) to be extended without modifying the data structure itself.
Visitors implement type-dependent processing through delegation.
- No conditionals are used; instead, control is passed through the host, since the host unequivocally knows its own type.
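
The points above can be sketched in a few lines of Java. This is only a minimal illustration, not the course's demo code: the names `IHost`, `IVisitor`, `HostA`, `HostB`, `DescribeVisitor` and `RankVisitor` are hypothetical.

```java
// Hypothetical names throughout; a minimal sketch, not the course's demo code.

/** The abstract host: all it can do is execute a visitor. */
interface IHost {
    Object execute(IVisitor visitor, Object... params);
}

/** One case per concrete host type. */
interface IVisitor {
    Object caseA(HostA host, Object... params);
    Object caseB(HostB host, Object... params);
}

class HostA implements IHost {
    public Object execute(IVisitor visitor, Object... params) {
        return visitor.caseA(this, params); // HostA only ever calls caseA
    }
}

class HostB implements IHost {
    public Object execute(IVisitor visitor, Object... params) {
        return visitor.caseB(this, params); // HostB only ever calls caseB
    }
}

/** One algorithm: describe the host. */
class DescribeVisitor implements IVisitor {
    public Object caseA(HostA host, Object... params) { return "This is an A host"; }
    public Object caseB(HostB host, Object... params) { return "This is a B host"; }
}

/** A second algorithm, added without touching any host code. */
class RankVisitor implements IVisitor {
    public Object caseA(HostA host, Object... params) { return 1; }
    public Object caseB(HostB host, Object... params) { return 2; }
}

class BasicVisitorDemo {
    public static void main(String[] args) {
        IHost[] hosts = { new HostA(), new HostB() };
        for (IHost host : hosts) {
            // No instanceof and no if/switch: the host picks the case.
            System.out.println(host.execute(new DescribeVisitor()));
            System.out.println(host.execute(new RankVisitor()));
        }
    }
}
```

Adding `RankVisitor` required no change to `HostA` or `HostB`, which is the decoupling described above.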

Here is a nice write-up about the visitor pattern by a former Comp310 student, Nav Ravindranath (Fall 2014), who rediscovered the visitor pattern when he got to industry: http://blog.delphix.com/nav-ravindranath/2016/04/27/accepting-the-visitors/ Note how coming to understand design is an iterative process of continually re-evaluating one's own work and thoughts.

Consider the following Visitor Design Pattern implementation:

[Figure: visitors]

Run the on-line Visitor demo.

Also, see the List Framework (listFW) visitor demo.

Important Visitor Design Pattern Features

A host calls its corresponding case on the visitor -- That's the only thing the host knows how to do. Its case is the only case it is aware of. In some sense, this behavior defines the host.

A host executes a visitor without knowing its semantics -- the host has no idea what a visitor does.

A visitor simply provides the services of its cases -- A visitor does not know when or how it will be used.

Hosts provide intrinsic, atomic behaviors for the visitors to use -- The hosts must provide the host-dependent behaviors that cannot be done by anyone else.

Visitors create algorithms by combining the atomic, intrinsic behaviors of their hosts -- Visitors are fundamentally external to the hosts and thus visitor behavior can only be based on the public methods offered by the hosts.

The hosts and the visitor are mutually dependent on each other -- a host needs a visitor to execute its algorithms and a visitor needs a host to provide its intrinsic behaviors.

The number of hosts is invariant -- The number of host classes is tied to the number of cases the visitor provides because there is a one-to-one relationship between the two. This is a critical design decision.

The number of visitors is variant -- A finite number of hosts can support an arbitrary number of visitors.

Visitors do NOT depend on the inter-relations between hosts -- It is a common myth that visitors are tied to recursion and composite data structures. Visitors are usable on any abstract host hierarchy where distinguishable hosts are identifiable.

To run a visitor-based processing of a host, neither the visitor's nor the host's concrete type needs to be known -- The visitor system runs at the top interface levels so it decouples from the specifics of which type of host is currently present and which type of visitor is currently being used. The result is arbitrary processing of arbitrary data.
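
Several of these features can be seen together in a small sketch. Everything here is an assumption made for illustration (the shape hosts, the visitor names, and the convention that `params[0]` carries a `Double` scale factor); none of it comes from the on-line demo.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shape hosts and visitors, used only to illustrate the features above.

interface IShape {
    Object execute(IShapeVisitor visitor, Object... params);
}

interface IShapeVisitor {
    Object caseCircle(Circle host, Object... params);
    Object caseSquare(Square host, Object... params);
}

class Circle implements IShape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    double getRadius() { return radius; } // intrinsic behavior offered to visitors
    public Object execute(IShapeVisitor visitor, Object... params) {
        return visitor.caseCircle(this, params);
    }
}

class Square implements IShape {
    private final double side;
    Square(double side) { this.side = side; }
    double getSide() { return side; } // intrinsic behavior offered to visitors
    public Object execute(IShapeVisitor visitor, Object... params) {
        return visitor.caseSquare(this, params);
    }
}

/** Builds an algorithm purely from the hosts' public methods. */
class AreaVisitor implements IShapeVisitor {
    public Object caseCircle(Circle host, Object... params) {
        return Math.PI * host.getRadius() * host.getRadius();
    }
    public Object caseSquare(Square host, Object... params) {
        return host.getSide() * host.getSide();
    }
}

/** Uses an input parameter: params[0] is assumed to be a Double scale factor. */
class ScaledPerimeterVisitor implements IShapeVisitor {
    public Object caseCircle(Circle host, Object... params) {
        double scale = (Double) params[0];
        return scale * 2.0 * Math.PI * host.getRadius();
    }
    public Object caseSquare(Square host, Object... params) {
        double scale = (Double) params[0];
        return scale * 4.0 * host.getSide();
    }
}

class ShapeProcessor {
    /** Knows only the top-level interfaces, never a concrete host or visitor type. */
    static List<Object> runAll(List<IShape> hosts, IShapeVisitor visitor, Object... params) {
        List<Object> results = new ArrayList<>();
        for (IShape host : hosts) {
            results.add(host.execute(visitor, params));
        }
        return results;
    }
}
```

`ShapeProcessor.runAll` works unchanged with any mix of hosts and any visitor, for example `ShapeProcessor.runAll(hosts, new ScaledPerimeterVisitor(), 0.5)`.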

The "Sweet Spot" for Visitors

Visitors are most effective in situations where multiple or changeable algorithms (processing) need to be run on objects at a point in the system where the type of the objects is not known, yet the algorithm depends on the object's type.

This is a surprisingly common situation in highly abstract systems.

Additional Lecture Materials (All Required!)

Using a Visitor to Process Two Unknown Hosts

Visitor Exercises: Required, though do not turn in. Complete BEFORE attempting homework!

Extend the on-line Visitor demo:
1. Write at least 2 more visitors that have distinctly different outputs for each host.
2. Add at least one text field that allows an input parameter to be passed to the visitor when it is executed. Write at least 2 visitors that show both host-dependent and input parameter-dependent behavior.
Extend the List Framework (listFW) visitor demo:
1. Write the following algorithms in each of the following manners: as a reverse accumulation-style visitor, as a forward accumulation-style visitor, as an accumulator with FoldRAlgo and as an accumulator with FoldLAlgo:
  1. Return the largest value in a list of integers. (Might be useful in some implementations: Integer.MIN_VALUE is the smallest possible Java integer value)
  2. Return a boolean indicating whether or not a given element is in the list.
2. Write the following algorithms in each of the following manners: as a forward accumulation-style visitor and as an accumulator with FoldLAlgo:
  1. Return the last element of the list (why can't you do this with reverse accumulation?)
3. Write the following algorithms in any style you wish, maybe even a combination of styles:
  1. Sum the elements in a list of integers from the n'th element to the end of the list.
  2. Return a list with the smallest element from the list removed. Do this with a single pass through the list!

You are absolutely encouraged to attempt more algorithms! The more practice you get in writing visitor algorithms, the better off you will be for the rest of the semester!

Things to Remember About Visitors:

Always remember these tips to help prevent common mistakes when working with visitors!

The hosts are the data structure and the visitors are the algorithms
- The number of hosts is fixed in the standard Visitor Design Pattern.
- Every visitor has a case for every type of host.
- The code for the hosts is invariant and is only written once. Any given host calls its associated case on the visitor: for HostX.execute() --> return visitor.caseHostX(this, params)
To run an algorithm, delegate to the data: result = host.execute(visitor, params)
- Always delegate to the next object to process it. Don't violate its encapsulation by extracting internal information from it!
- Recursion is a special case of delegation where one is delegating to the next object with the same visitor, e.g. for a list: host.getRest().execute(this, params)
No conditionals are needed to do type-dependent processing!
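
As a rough illustration of these tips, here is a sketch of a recursive list visitor. The list classes below are hypothetical stand-ins and are not the listFW API, which has its own names. The two visitors show what is assumed here to be meant by reverse accumulation (the result is built as the recursion returns) and forward accumulation (a running value is passed down as a parameter).

```java
// Hypothetical immutable list hosts; the course's listFW uses its own names and API.

interface ISeq {
    Object execute(ISeqVisitor visitor, Object... params);
}

interface ISeqVisitor {
    Object emptyCase(EmptySeq host, Object... params);
    Object nonEmptyCase(NonEmptySeq host, Object... params);
}

class EmptySeq implements ISeq {
    public Object execute(ISeqVisitor visitor, Object... params) {
        return visitor.emptyCase(this, params);
    }
}

class NonEmptySeq implements ISeq {
    private final int first;
    private final ISeq rest;
    NonEmptySeq(int first, ISeq rest) { this.first = first; this.rest = rest; }
    int getFirst() { return first; }
    ISeq getRest() { return rest; }
    public Object execute(ISeqVisitor visitor, Object... params) {
        return visitor.nonEmptyCase(this, params);
    }
}

/** Reverse accumulation: the sum is built as the recursion returns. */
class SumVisitor implements ISeqVisitor {
    public Object emptyCase(EmptySeq host, Object... params) {
        return 0; // base case
    }
    public Object nonEmptyCase(NonEmptySeq host, Object... params) {
        // Delegate to the rest with the same visitor; never ask "is the rest empty?"
        return host.getFirst() + (Integer) host.getRest().execute(this);
    }
}

/** Forward accumulation: the running total travels down the list in params[0]. */
class SumForwardVisitor implements ISeqVisitor {
    public Object emptyCase(EmptySeq host, Object... params) {
        return params[0]; // the accumulator already holds the answer
    }
    public Object nonEmptyCase(NonEmptySeq host, Object... params) {
        int acc = (Integer) params[0];
        return host.getRest().execute(this, acc + host.getFirst());
    }
}
```

With this sketch, `list.execute(new SumVisitor())` and `list.execute(new SumForwardVisitor(), 0)` compute the same sum; the forward version needs its initial accumulator value supplied by the caller.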

Why not just overload the visitor's `caseX()` methods so they all have the same name but different overall method signatures?

If one looks around the web, one will often run across implementations of the visitor pattern where instead of having case methods like this where each method's name corresponds to its semantic,

public interface IVisitor {

	public Object caseA(HostA host, Object... params);
	
	public Object caseB(HostB host, Object... params);
	
	public Object caseC(HostC host, Object... params);
}

the implementers instead overload the case methods so that all the cases have the same method name and the compiler automatically distinguishes between the cases by the invoked signature, i.e. by the host parameter's type:

public interface IVisitor {

	// Named visit() here because "case" is a reserved word in Java and cannot be used as a method name.
	public Object visit(HostA host, Object... params);
	
	public Object visit(HostB host, Object... params);
	
	public Object visit(HostC host, Object... params);
}

In this situation, the code for each host's execute() method looks like this:

Object execute(IVisitor visitor, Object... params) {
	return visitor.visit(this, params);   // the compile-time type of "this" is the enclosing host class, which selects the overload
}

This seems like a nice shortening of the code, with perhaps the added advantage of fewer typos in which a host calls the wrong case, because all the host code looks the same and the proper dispatching is handled by the compiler.

This is a very common question, so it bears spending a little time in discussion.

Semantic clarity

It is true that, technically, Java uses the entire method signature, which includes not only the method name but also the types of the input parameters, to determine how to dispatch a call, that is, to decide which specific method to actually call in any given situation. Java thus allows a developer to overload a single method name with multiple methods, so long as each overloaded method has a unique sequence of parameter types. The above execute method's code will "work" because the type of the host is known to the compiler at that point, and thus the compiler can figure out which of the overloaded case methods to call when each of the different host types calls the case method.
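
The following sketch shows that the choice among same-named methods is made by the compiler from the declared type of the argument, which is why the call only works from inside the host. The `Cat`, `Dog` and `SoundVisitor` names are hypothetical, and the shared method is called `visit` because `case` is a reserved word in Java.

```java
// Hypothetical types. The overloaded method is named visit because
// "case" is a reserved word in Java and cannot be used as a method name.

interface IAnimalVisitor {
    String visit(Cat host);
    String visit(Dog host);
}

interface IAnimal {
    String execute(IAnimalVisitor visitor);
}

class Cat implements IAnimal {
    public String execute(IAnimalVisitor visitor) {
        return visitor.visit(this); // "this" is declared as Cat here, so visit(Cat) is chosen
    }
}

class Dog implements IAnimal {
    public String execute(IAnimalVisitor visitor) {
        return visitor.visit(this); // identical text, but "this" is declared as Dog
    }
}

class SoundVisitor implements IAnimalVisitor {
    public String visit(Cat host) { return "meow"; }
    public String visit(Dog host) { return "woof"; }
}

class OverloadDemo {
    static String describe(IAnimal animal, IAnimalVisitor visitor) {
        // visitor.visit(animal) would NOT compile here: the declared type of
        // "animal" is IAnimal, and there is no visit(IAnimal) overload.
        // The overload is picked at compile time, so we must go through the host.
        return animal.execute(visitor);
    }
}
```

The two `execute` bodies read identically, yet they bind to different methods; that hidden difference is the loss of semantic clarity discussed next.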

The problem is that the different case methods are fundamentally different in semantics, that is, each case has the particular semantic of processing a specific type of host. The semantics of the different cases are NOT equivalent. But when one looks at the code above, since the code is identical from one host to another, the semantics of the call are not obvious from just reading the words of the code; one has to take the additional step of realizing that the semantics of the call are bound up in the type of the this parameter (the host).

On the other hand, the semantics of a call that is "return visitor.caseA(this, params)" are patently obvious because we can see right away that we are calling the case to process a type "A" host.

The purported time and error savings of overloading the case methods are overstated because the host code is on the invariant side of the system. That means that while the hosts' execute methods appear to be easier to write, this only ever happens once. The host code is written once and it is done forever, so the time and error savings are minimal at best and come at the cost of semantic clarity.

For large systems, semantic clarity is paramount for ensuring that the system behaves properly. For that reason, here we will always favor semantic clarity over code simplification.

Mutable hosts

The implicit assumption that we have made in all of our discussions so far is that the "type" of the host is immutable, that is, the apparent type of a host never changes. In fact, the visitor implementations that utilize case method name overloading rely on this invariant.

But if one closely examines the requirements of the visitor design pattern, one sees that the only required invariant is that a "host calls its associated case". There is no statement that says that the case called is the same case for all time, that is, that the case a host calls is time invariant. In fact, one way to define a "host" is to turn the visitor pattern around and state that "HostX" is that entity which calls the visitor's "caseX()" method. Here, the visitor actually defines the host!

In fact, there are objects whose behavior changes in such ways that they literally appear to change type. This phenomenon is called "dynamic reclassification" and is modeled by the State Design Pattern. In these mutable data structures, any given host will call different cases on the visitor depending on what has happened to the host in the past. Examples of such data structures, including how they work with visitors, are mutable lists (Mutable Linear Recursive Structure) and the mutable binary tree (Binary Tree Structure). Unfortunately, a detailed discussion of dynamic reclassification is beyond the scope of this class -- see the beginning of COMP 405/505.

Because the host's apparent type is mutable, the case methods cannot be typed to a specific host subtype (which doesn't even exist publicly); instead, all cases must accept the top-level mutable host type:

public interface IVisitor {

	public Object caseA(MutableHost host, Object... params);
	
	public Object caseB(MutableHost host, Object... params);
	
	public Object caseC(MutableHost host, Object... params);
}

In this situation, the technique of overloading the case methods with the same method name will NOT work at all because the cases are indistinguishable if the method names are identical: every case would have exactly the same signature, which the compiler rejects as a duplicate method. But by retaining case method names that are linked to their semantics, even "type-changing" hosts have no problem with always invoking the proper case on the visitor for the current "type" (state) of the host.

(Note: To ensure proper type-dependent processing of the host's current "type", algorithms for mutable data structures must always delegate to the host for any processing that might be type-dependent, i.e. everything is written as visitors. The hosts are built to always call the proper case on the visitor corresponding to their current "type"/state.)
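
A "type-changing" host can be sketched with the State Design Pattern as follows. This is only one possible arrangement under assumed names (`MutableBox`, `IBoxVisitor`, `emptyCase`, `fullCase`); it is not the Mutable Linear Recursive Structure or Binary Tree Structure code referred to above.

```java
// Hypothetical sketch of a "type-changing" host built with the State pattern.

interface IBoxVisitor {
    // Both cases take the same top-level host type, so only their names
    // tell them apart; giving them one shared name would not compile.
    Object emptyCase(MutableBox host, Object... params);
    Object fullCase(MutableBox host, Object... params);
}

class MutableBox {
    /** Private state: each state knows which visitor case to call. */
    private interface IState {
        Object execute(MutableBox host, IBoxVisitor visitor, Object... params);
    }

    private static final IState EMPTY = new IState() {
        public Object execute(MutableBox host, IBoxVisitor visitor, Object... params) {
            return visitor.emptyCase(host, params);
        }
    };

    private static final IState FULL = new IState() {
        public Object execute(MutableBox host, IBoxVisitor visitor, Object... params) {
            return visitor.fullCase(host, params);
        }
    };

    private IState state = EMPTY;
    private Object contents;

    void put(Object item) { contents = item; state = FULL; }  // empty -> full
    void clear() { contents = null; state = EMPTY; }          // full -> empty
    Object getContents() { return contents; }

    /** The same host object calls different cases over its lifetime. */
    public Object execute(IBoxVisitor visitor, Object... params) {
        return state.execute(this, visitor, params);
    }
}

class DescribeBoxVisitor implements IBoxVisitor {
    public Object emptyCase(MutableBox host, Object... params) { return "empty"; }
    public Object fullCase(MutableBox host, Object... params) {
        return "holds " + host.getContents();
    }
}
```

Calling `box.execute(visitor)` before and after `box.put(...)` reaches different cases on the same visitor, with no conditional anywhere in the host or the visitor.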

Conclusion

While overloading case methods to all have the same name but different host type parameters will compile and execute just fine for immutable host data structures, the technique gives up semantic clarity for minimal developer advantages. Plus, when extending the notion of visitors to mutable data structures, that technique breaks down completely.

Favoring semantic clarity over convenience not only helps in clarifying, emphasizing and ensuring proper system behavior, but also enhances overall extensibility and flexibility, because a system is defined by its semantics, not by its code.
