Parallel efficiency is computed as S/(p * T(p)), where
S is the wall-clock time of a sequential execution of
your program, p is the number of processors, and T(p)
is the wall-clock time of an execution on p processors.
Don't make the mistake of using
the execution time of a one-thread version of your parallel implementation
as the time of a sequential execution. The execution time of the one-thread
parallel implementation is typically larger than that of the original
sequential implementation, because it still carries the overhead of the
parallel constructs.
When you compute parallel efficiency, always use the performance of
the original sequential code as a baseline; otherwise, you will
overestimate the value of your parallelization.
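
As a rough illustration of why the baseline matters, here is a minimal
Python sketch of the formula above. The function name and all timings
are hypothetical, made up only for this example; they are not
measurements from any real program.

```python
def parallel_efficiency(seq_time, p, par_time):
    """Return S / (p * T(p)).

    seq_time -- S, wall-clock time of the original sequential code
    p        -- number of processors
    par_time -- T(p), wall-clock time of the parallel code on p processors
    """
    if p <= 0 or par_time <= 0:
        raise ValueError("p and par_time must be positive")
    return seq_time / (p * par_time)


# Hypothetical timings in seconds, chosen only for illustration.
S = 100.0    # original sequential implementation
T1 = 125.0   # parallel implementation run with one thread
T8 = 20.0    # parallel implementation run on 8 processors

correct = parallel_efficiency(S, 8, T8)    # baseline: sequential code
inflated = parallel_efficiency(T1, 8, T8)  # baseline: one-thread parallel code

print(f"efficiency with sequential baseline: {correct:.3f}")
print(f"efficiency with one-thread baseline: {inflated:.3f}")
```

With these made-up numbers, the one-thread baseline reports a higher
efficiency than the sequential baseline for exactly the same parallel
run, which is the overestimate described above.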