In this work, we focus on the use of implicit information for scheduling communicating jobs in a cluster of workstations. The traditional time-sharing approach for communicating jobs has been explicit coscheduling, in which the processes of a parallel job are explicitly run at the same time across processors. We propose an alternative, implicit coscheduling, in which the local schedulers in the system dynamically coordinate scheduling by observing only implicit information, such as message round-trip times and message arrival rates.
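As a rough illustration of how a local scheduler might act on implicit information alone, the Python sketch below decides how long a process waiting for a reply should keep spinning before it gives up the processor. It is not the policy evaluated in this work: the names `ImplicitState`, `spin_limit`, and `should_block`, the parameters `slack`, `extend`, and `min_rate`, and the rule itself are assumptions made only for this example. The intuition is that fast recent round trips or a steady stream of incoming messages suggest the communicating peers are currently scheduled, so waiting a little longer is likely to pay off.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ImplicitState:
    """Locally observable communication history for one process (hypothetical)."""
    window: int = 8                                  # how many samples to remember
    rtts: deque = field(default_factory=deque)       # recent round-trip times (s)
    arrivals: deque = field(default_factory=deque)   # recent arrival timestamps (s)

    def record_rtt(self, rtt: float) -> None:
        self.rtts.append(rtt)
        if len(self.rtts) > self.window:
            self.rtts.popleft()

    def record_arrival(self, timestamp: float) -> None:
        self.arrivals.append(timestamp)
        if len(self.arrivals) > self.window:
            self.arrivals.popleft()

    def mean_rtt(self):
        """Average of the remembered round-trip times, or None if there are none."""
        return sum(self.rtts) / len(self.rtts) if self.rtts else None

    def arrival_rate(self) -> float:
        """Messages per second over the remembered arrivals (0.0 if too few)."""
        if len(self.arrivals) < 2:
            return 0.0
        span = self.arrivals[-1] - self.arrivals[0]
        return (len(self.arrivals) - 1) / span if span > 0 else 0.0


def spin_limit(state: ImplicitState, baseline_rtt: float,
               slack: float = 2.0, extend: float = 5.0,
               min_rate: float = 100.0) -> float:
    """Return how long (s) to spin before blocking. All thresholds are assumed."""
    limit = slack * baseline_rtt          # always wait about one expected round trip
    mean = state.mean_rtt()
    fast_replies = mean is not None and mean <= slack * baseline_rtt
    busy_inbox = state.arrival_rate() >= min_rate
    if fast_replies or busy_inbox:        # evidence that peers are scheduled now
        limit *= extend
    return limit


def should_block(state: ImplicitState, waited: float, baseline_rtt: float) -> bool:
    """True once the process has spun past its limit and should yield the CPU."""
    return waited >= spin_limit(state, baseline_rtt)
```

With no history, a process that has already waited several baseline round trips would block; after a few fast replies have been recorded, the same wait would still be within its extended spin limit.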
In this talk, we describe our experience with developing implicit coscheduling, from our initial simulations to our prototype implementation and back to further simulations. In addition to showing the performance of implicit coscheduling on a variety of workloads, we highlight the strengths and weaknesses of simulation and of implementation as methods of evaluation.