Traversals and Accumulations in Object-oriented Algorithms

COMP 310    Java Resources  Eclipse Resources

(Back to Design) (Back to Visitors)

Object-oriented systems consist of interconnected objects. Each object holds some information and is capable of certain operations. An algorithm in an OO system is characterized by a traversal across the objects where the information and capabilities of each object are gathered and/or utilized. The path through the objects is the "traversal" and the information or capability gathering ( think in terms of gathering lambda functions where behavior is a first-class entity that can be treated like data) process is the "accumulation".

The traversal of an OO system is NOT the same as the accumulation of information occurring during the traversal. The traversal is the path taken through the interconnected objects and the accumulation is the process in which information is gathered from each object, which can be independent of the path.

In general, OO algorithms can be broken into two parts: a traversal and an accumulation. Changing either one will often change what the algorithm does, e.g. an algorithm using a list accumulation will change copying to reversing a list depending on the the direction of traversal through the list or which part of the traversal (forward or reverse, see below) in which the accumulation is taking place.

In a system of objects, information and capabilities are spread across the objects; any one object neither has enough information nor enough capability to provide the answer to any request -- think of an algorithm as a process that attempts to produce an answer to a request, e.g. a summation algorithm is trying to produce the answer to the request for the sum of all the objects.

Thus the "job" of an algorithm over an object system is to traverse the objects and accumulate information such that the necessary information is brought to the necessary object behaviors that can be used to produce the desired results. This notion becomes very generalized when one considers behaviors as objects that can be accumulated just like any other information. More informally, one can think of simply bringing together the data and the entities that know how to process it. In more complex system, the notion of time works in as well, i.e.bringing the data and the processing behaviors together at the right time.

Traversals

How a system of interconnected objects is traversed depends on the geometry of the connectivity graph.

Important: The following discussion on traversals is independent of whether the traversals are implemented using delegation-model programming or by other more-procedural programming techniques such as loops.

Linear traversals: For objects connected in a line.

Linear traversals can be accumulated front-to-back or back-to-front or both directions.

Tree traversals: For objects in recursive tree structures

Both depth-first and breadth-first traversals can be accumulated top-to-bottom, bottom-to-top or both directions.

General graph traversals: For objects connected in topologies not covered above.

Note that some algorithms have traversals that don't necessarily cover the entire object graph, such as some "greedy" algorithms.

Once again, the accumulations can occur in one or both of the directions with respect to the traversal.

Delegation vs. Recursion

In delegation-model programming, traversals are created by delegating from one object to another object to create a path through the object graph. The notion of an algorithm says nothing about what sort of behaviors are being invoked on subsequent objects or what processes are being run on them. An algorithm simply does what it needs to do with the subsequent object.

Recursive algorithms are just a special case of delegation-model programming where the behavior invoked on the next object is abstractly equivalent to the behavior being run on the current object.

There is nothing more to recursion than that. Mathematically, recursion is interesting because the recursive process creates a behavioral invariant that can be used in mathematical analysis of the properties of the algorithm, e.g. proving correctness and calculating algorithmic complexity. In terms of OO systems, there's nothing fundamental about recursion other than it allows arbitrarily large and complex processes and structures to be defined in a finite manner.

 

Forward vs. Reverse Accumulation

Since in delegation model programming, a traversal across a network of objects is a result of one object delegating to another object to which it is linked, there is a notion of directionality in the traversal process:

The accumulation taking place in an algorithm can thus be characterized by whether it takes place in the forward or reverse or both directions during the traversal.

Reverse Accumulation:

Information is accumulated as the traversal returns from processing new other objects. This type of accumulation typically only occurs in recursive traversals.

Reverse accumulation is the type of accumulation used in "Recursion 101"-type algorithms where the result is calculated from the return value from processing the rest of the system, e.g. "the sum of a list is zero when the list is empty or when the list is non-empty, the first element plus the sum of the rest of the list".

Reverse accumulation is characterized by using the return value of a delegated process (e.g. recursively delegating to the rest of the list) to calculate the next value of the accumulation.

The fold-right, "FoldR", algorithm is an articulation of breaking a reverse accumulation algorithm down into an invariant linear traversal with a reverse accumulation combined with a variant accumulation process.

Note that the invariant traversal process for reverse accumulation is the same whether or not a base case accumulation value can be defined for the algorithm.

 

Forward Accumulation:

Information is accumulated as the traversal progresses to yet-to-be-processed objects. This type of accumulation can occur in both recursive and non-recursive traversals.

Since information being passed foward during an delegation must be passed as an input parameter, the accumulation process for a forward accumulation algorithm must take (at least) one more input parameter than the initial invocation of the algorithm by its user.

Thus, forward accumulation algorithms are characterized by TWO separate sub-algorithms:

  1. A non-recursive "main" or "outer" algorithm that the user of the algorithm sees that sets up the accumulator for the inner helper algorithm using the first object in the structure. This main processes then passes the accumulator to the next object (delegates to) using the helper algorithm, thus starting off the recursive process.
  2. A recursive "helper" algorithm that processes the rest of the structure. The helper algorithm calculates the next accumulator value using the accumulator it was given from the previous object and information from the current object. The helper then recursively passes the new accumulator to the next object in the structure. The base case of the helper simply returns the accumulator.

The helper algorithm fundamentally takes (at least) one more argument than the main algorithm, i.e. it takes the accumulator as an input parameter.

Both the main and helper algorithms will simply return the returned value from their delegations without doing any additional processing because the necessary processing was already done on the way in. This is the "tail call" characteristic that enables forward accumulation algorithms to be optimized into loops for greater execution performance.

IMPORTANT NOTE: In many discussion about forward accumulation algorithms, the process is described as a single operation that is "seeded" with an initial accumulatior value, e.g. "sum the elements of a list, starting with a base value of 0". These discussion fail to recognize several critical issues:

  1. The description of forward accumulation as a single process that takes a "base case value" fundamentally restricts the applicability of the process to only those types of algorithms where a base case value can be defined. This is not true for all algorithms, such as "find the smallest value in a list". The two sub-algorithm description above does not have this restriction.
  2. The user of the algorithm (the entity that invokes the algorithm to use its results) does not care how the algorithm is implemented internally. To require that invoker supply the base case value for the algorithm is to couple the invoker to the implementation of the algorithm.

The fold-left, "FoldL", algorithm is an articulation of breaking a forward accumulation algorithm where a base case value can be defined, down into an invariant linear traversal with a forward accumulation combined with a variant accumulation process.

 

An algorithm can use forward or reverse or both types of accumulation!

Example of an algorithm that uses both types of accumulation: "return a copy of a list where all occurences of a specified element of the original list has been removed". In this algorithm, the locations in the list of the specified element is accumulated in the forward direction and then in the reverse accumulation, the forward-accumlated information is used to accumulate a new list with the selected element removed.

Using the same accumulator may not produce the same results for forward vs. reverse accumulation traversals!

Some algorithms are independent of the accumulation direction, e.g. "sum all the elements of a list" where the accumulator simple sums all the values it is given. On the other hand, some algorithms are direction-sensitive. For example, an accumulator that accumulates a list made from the given values, will copy a list in its original order when used with reverse accumulation (e.g. with FoldR) but will copy the list in reverse order when used with forward accumulation (e.g. wtih FoldL). Mathematically, this issue is linked to the "associativity" of the accumulation process.

 

 

 

 

 

 

 

 

 

 

© 2020 by Stephen Wong