Robust mouldable intelligent scheduling using application benchmarking for elastic environments

Kureshi, Ibad, Holmes, Violeta, Cooke, D., Allan, R., Liang, Shuo and Gubb, D. (2012) Robust mouldable intelligent scheduling using application benchmarking for elastic environments. In: Proceedings of The Queen’s Diamond Jubilee Computing and Engineering Annual Researchers’ Conference 2012: CEARC’12. University of Huddersfield, Huddersfield, p. 156. ISBN 978-1-86218-106-9


        Abstract

        In a world preoccupied with green IT, the efficient use of computer hardware becomes essential, and the effect is multiplied in High Performance Computing (HPC) systems. With an average compute rack consuming between 7 and 25 kW, resources must be utilised as efficiently as possible. Currently, the batch schedulers employed to manage these multi-user, multi-application environments are nothing more than matchmaking and service level agreement (SLA) enforcement tools. System administrators strive to get maximum “usage efficiency” from their systems by fine-tuning and restricting queues to obtain predictable performance characteristics, e.g. any software package running in queue X will take N cores and run for a maximum time T. These fixed approximations of performance characteristics are then used to schedule queued jobs, in the hope of achieving 100% utilisation.

        The choice of queue falls on the user. A savvy user may use trial and error to establish which queue best suits his or her needs, but most users will find a queue that gives them results and stick with it, even if they change the model being simulated. This usually leads to a job receiving either an over- or under-allocation of resources, resulting in either hardware failure or inefficient utilisation of the system. Ideally, the system should know how a particular application with a particular dataset will behave when run.

        Benchmarking schemes have historically been used as marketing and administration tools. Some schemes, such as those from the Standard Performance Evaluation Corporation (SPEC) and the Perfect Benchmark, used “real” applications with generic datasets to test a system's performance. This way, a scientist looking for a cluster computer could ask “How well will my software run?” rather than “How many FLOPS can I get out of this system?” If adapted to include an API for plugging in any software to benchmark and for passing results to other software, these toolkits can be used for purposes other than sales and marketing. If a job scheduler has access to performance characteristic curves for every application on the system, optimal resource allocation and scheduling/queuing decisions can be made at submit time by the system rather than by the user. This would further improve the performance of mouldable schedulers that currently follow the Downey model.

        Beyond resource allocation and scheduling decisions, further optimisation is possible if the scheduler can collect a historic record of simulations run by particular users. This would lead to better and safer utilisation of the system. Currently, AI is used for some decision making in mouldable schedulers: given a user-supplied range of required resources, the scheduler decides on a resource allocation by selecting from that range. If the user-supplied range is incorrect, the scheduler is powerless to adapt, and on the next run it cannot learn from previous mistakes or successes.

        This project aims to adapt an open-framework benchmarking scheme to feed information to a job scheduler. The scheduler will also use gathered heuristic data to make scheduling decisions and to optimise resource allocation and system utilisation. The work will be further expanded to include elastic or even shared resource environments, where the scheduler can expand the size of its world based on either financial or SLA-driven decisions.
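
As a rough illustration of how a scheduler might use a benchmarked performance curve at submit time, the sketch below picks a core count from measured runtimes. It is not the project's implementation: the function name `choose_cores`, the parallel-efficiency threshold and the sample runtimes are all hypothetical, chosen only to show the idea of letting the system, rather than the user, select the allocation.

```python
from typing import Dict


def choose_cores(curve: Dict[int, float], min_efficiency: float = 0.7) -> int:
    """Return the largest benchmarked core count whose parallel
    efficiency, relative to the smallest benchmarked run, is at
    least min_efficiency.

    curve maps a core count to a measured runtime in seconds
    (hypothetical benchmark data for one application and dataset).
    """
    base_cores = min(curve)
    base_time = curve[base_cores]
    best = base_cores
    for cores in sorted(curve):
        speedup = base_time / curve[cores]
        # Efficiency = achieved speedup / ideal speedup for this core count.
        efficiency = speedup / (cores / base_cores)
        if efficiency >= min_efficiency:
            best = cores
    return best


# Illustrative (made-up) benchmark curve: cores -> runtime in seconds.
benchmark_curve = {1: 1000.0, 2: 520.0, 4: 280.0, 8: 170.0, 16: 140.0}

print(choose_cores(benchmark_curve))       # 70% efficiency threshold
print(choose_cores(benchmark_curve, 0.9))  # stricter 90% threshold
```

With this sample curve, going from 8 to 16 cores barely shortens the run, so a threshold-based choice stops at 8 cores and leaves the remainder free for other jobs. A scheduler that also keeps a history of past runs could refine the curve over time instead of relying on a fixed queue definition.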

        Item Type: Book Chapter
        Uncontrolled Keywords: scheduler, Mouldable benchmarking, elastic cloud, HPC, batch processing, cluster, grid
        Subjects: T Technology > TA Engineering (General). Civil engineering (General)
        Schools: School of Computing and Engineering
        School of Computing and Engineering > Computing and Engineering Annual Researchers' Conference (CEARC)
        School of Computing and Engineering > High Performance Computing Research Group
        School of Computing and Engineering > Systems Engineering Research Group
        Depositing User: Sharon Beastall
        Date Deposited: 03 May 2012 12:55
        Last Modified: 03 May 2012 12:55
        URI: http://eprints.hud.ac.uk/id/eprint/13492
