Coarray Fortran 2.0 (CAF 2.0) is a set of extensions to Fortran that supports a partitioned global address space model for parallel programming. CAF 2.0 provides a rich set of constructs for managing parallelism, including teams, asynchronous communication, function shipping, collective communication, and synchronization objects.
In this talk, we will introduce two synchronization constructs: cofence and finish. Cofence controls the local completion of put, get, and implicitly synchronized asynchronous operations. Finish ensures global completion of asynchronous operations within a team. By distinguishing between local and global completion of asynchronous operations, these constructs make it possible to tune performance. We will discuss the subtle interactions between these constructs and asynchronous collectives, asynchronous copy operations, function shipping, and synchronization objects. Taken as a whole, these primitives provide expressive support for writing high-performance programs. We will demonstrate the utility of these constructs using an unbalanced tree search (UTS) application.
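The difference between local and global completion can be illustrated with a small single-process model. The sketch below is written in Python rather than CAF 2.0, and every name in it (`AsyncPut`, `cofence`, `finish`, `demo`) is a hypothetical stand-in, not the CAF 2.0 syntax or runtime. It assumes that local completion means the source buffer has been read and may be reused, and that global completion means the data has arrived at its destination. It does not model teams or function shipping.

```python
import threading
import time


class AsyncPut:
    """Toy model of one asynchronous put (hypothetical, not the CAF 2.0 API)."""

    def __init__(self, src, dest, delay=0.05):
        self.local_done = threading.Event()   # set once the source buffer has been read
        self.global_done = threading.Event()  # set once the data has landed in dest
        self._thread = threading.Thread(target=self._run, args=(src, dest, delay))
        self._thread.start()

    def _run(self, src, dest, delay):
        staged = list(src)        # snapshot the source: the put is now locally complete
        self.local_done.set()
        time.sleep(delay)         # simulated network latency
        dest[:] = staged          # delivery: the put is now globally complete
        self.global_done.set()


def cofence(ops):
    """Wait only for local completion: source buffers may be reused afterwards."""
    for op in ops:
        op.local_done.wait()


def finish(ops):
    """Wait for global completion: every put has been delivered."""
    for op in ops:
        op.global_done.wait()
        op._thread.join()


def demo():
    src = [1, 2, 3]
    dest = [0, 0, 0]
    op = AsyncPut(src, dest)
    cofence([op])   # cheap wait: the data may still be in flight
    src[0] = 99     # safe, because the put has already read src
    finish([op])    # stronger wait: the data is now visible in dest
    return src, dest


if __name__ == "__main__":
    print(demo())
```

In this model, the work done between `cofence` and `finish` overlaps with the simulated transfer, which is the kind of tuning opportunity that separating the two completion levels is meant to expose.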
We achieve 91% parallel efficiency on 1024 PEs for UTS (T1WL), which matches the efficiency of the state-of-the-art UTS implementation by Saraswat et al. described in the paper "Lifeline-based global load balancing".
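To put the efficiency figure in perspective, the following short calculation assumes the conventional definition of parallel efficiency as speedup divided by the number of PEs. The abstract does not state which baseline is used, so the symbols $T_1$ (single-PE time) and $T_p$ (time on $p$ PEs) are assumptions introduced here for illustration.

```latex
E(p) = \frac{S(p)}{p} = \frac{T_1}{p \, T_p},
\qquad
E(1024) = 0.91
\;\Longrightarrow\;
S(1024) = 0.91 \times 1024 \approx 932 .
```

Under that assumption, 91% efficiency on 1024 PEs corresponds to a speedup of roughly 932 over a single PE.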