
UBITECH co-authors a scientific publication on converging HPC, Big Data and Cloud technologies for precision agriculture data analytics

A scientific paper entitled “Converging HPC, Big Data and Cloud technologies for precision agriculture data analytics on supercomputers” has been co-authored by Ryax Technologies, HLRS High Performance Computing Center Stuttgart, Poznan Supercomputing and Networking Center, CINECA, National Technical University of Athens, LeanXcale and UBITECH. It is presented at the 15th Workshop on Virtualization in High-Performance Cloud Computing (VHPC 20), held in conjunction with the International Supercomputing Conference – High Performance on June 21-25, 2020 in Frankfurt, Germany. In this paper, Dr Sophia Karagiorgou and her co-authors discuss the orchestration of precision agriculture and livestock farming analytics workflows across hybrid IoT, Big Data, HPC and Cloud infrastructures.

The convergence of HPC and Big Data, along with the influence of Cloud, is playing an important role in the democratization of HPC. The growing demand of Data Analytics for computational power has opened new fields of interest for HPC facilities, but has also raised new problems such as interoperability with Cloud and ease of use. Besides typical HPC applications, these infrastructures are now asked to handle more complex workflows combining Machine Learning, Big Data and HPC. This brings challenges at the resource management, scheduling and environment deployment layers. Hence, enhancements are needed to allow multiple frameworks to be deployed under common system management while providing the right abstraction to facilitate adoption.

The paper presents the architecture adopted for the parallel and distributed execution management software stack of the H2020 co-funded CYBELE project, which is deployed on production HPC centers to execute hybrid data analytics workflows for precision agriculture and livestock farming applications. The design is based on: Kubernetes as a higher-level orchestrator of Big Data components and hybrid workflows, and as a common interface to submit HPC or Big Data jobs; Slurm or Torque for HPC resource management; and the Singularity containerization platform for the dynamic deployment of the different Data Analytics frameworks on HPC. The paper showcases precision agriculture workflows executed on this architecture and provides initial performance evaluation results and insights for the whole prototype design.
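
To give a rough feel for the lowest layer of such a stack, the following minimal Python sketch renders a Slurm batch script that runs one analytics step inside a Singularity container. It is an illustration only and not the CYBELE implementation: the `AnalyticsJob` class, the `render_slurm_script` function, the image name `analytics.sif` and the command `python train.py` are all hypothetical names chosen for this example. Only the standard `#SBATCH` directives, `srun` and `singularity exec` are assumed.

```python
from dataclasses import dataclass


@dataclass
class AnalyticsJob:
    """Hypothetical description of one containerized analytics step."""
    name: str                   # job name shown in the Slurm queue
    image: str                  # path to a Singularity image file (assumed to exist)
    command: str                # command to run inside the container
    nodes: int = 1              # number of nodes requested from Slurm
    walltime: str = "00:30:00"  # time limit in HH:MM:SS


def render_slurm_script(job: AnalyticsJob) -> str:
    """Render a Slurm batch script that runs the step inside Singularity."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job.name}",
        f"#SBATCH --nodes={job.nodes}",
        f"#SBATCH --time={job.walltime}",
        # srun launches the containerized command on the allocated nodes
        f"srun singularity exec {job.image} {job.command}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical precision agriculture step; all values are placeholders.
    job = AnalyticsJob(
        name="crop-yield",
        image="analytics.sif",
        command="python train.py",
        nodes=2,
    )
    print(render_slurm_script(job))
```

In an architecture like the one described, a higher-level orchestrator would generate and submit such a script on the user's behalf (for example with `sbatch`), so that the user works against one common interface and does not write batch scripts by hand.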