MIFO: A query-semantic aware resource allocation policy
Data analytics frameworks encourage cluster sharing for the execution of mixed workloads by promising fairness and isolation along with high performance and resource utilization. However, concurrent query execution on such shared clusters increases queue and resource waiting times for queries, degrading their overall performance. MIFO is a dataflow-aware scheduling policy that mitigates the impact of queue and resource contention by reducing the waiting times of queries that are near completion. We present heuristics that exploit query semantics to proactively trigger MIFO-based allocations in a workload. Our experiments on Apache Spark using the TPC-DS benchmark show that, compared to a FAIR policy, MIFO provides an improved mean response time, a reduced workload makespan, and an average speedup between 1.2x and 2.7x in a highly concurrent setting, with only a momentary deviation in fairness.