
UBITECH presents a scientific paper on big data streaming and load balancing mechanisms for machine learning at SIMPLIFY 2021, co-located with EDBT 2021

The paper “Weighted Load Balancing Mechanisms over Streaming Big Data for Online Machine Learning” has been accepted for presentation at the “1st International Workshop on Data Analytics and Machine Learning Made Simple” (SIMPLIFY 2021), held jointly with the 24th International Conference on Extending Database Technology (EDBT 2021), which is co-located with the 24th International Conference on Database Theory (ICDT 2021). UBITECH’s Privacy-preserving Distributed Machine Learning research group designed and implemented a prototype collection of microservices, deployed as containerized pods on Kubernetes, named Information Aware Networking Mechanisms.

In particular, the Istio service mesh has been deployed on a Kubernetes cluster to establish pod-level communication, delegate traffic flows and filter requests. In addition, Apache JMeter has been used to generate functional behavior that stresses the performance of a web-mobile application served by the Kubernetes cluster, which supports a distributed object storage and an online machine learning model implemented in Apache Spark MLlib. The ML model incorporates the Alternating Least Squares (ALS) algorithm and a recommender service that posts personalised suggestions to end users. The Information Aware Networking Mechanisms take into account the number of requests per second or the volume of data per hour at the web-mobile application and provide weighted load balancing mechanisms to minimize inter-pod communication and prioritize important events for the online machine learning model.
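To give a rough idea of what weighted load balancing can look like, here is a minimal Python sketch. It is not the mechanism from the paper. It assumes that each destination carries a hypothetical score derived from a metric such as requests per second, that scores are scaled to integer weights summing to 100, and that requests are dispatched with a smooth weighted round-robin. The function names and the destination names `recommender-v1` and `recommender-v2` are illustrative only.

```python
from typing import Dict, List


def to_percent_weights(metric: Dict[str, float]) -> Dict[str, int]:
    """Scale non-negative metric values to integer weights that sum to 100.

    Uses the largest-remainder method so rounding never loses or adds points.
    """
    total = sum(metric.values())
    if total <= 0:
        raise ValueError("at least one metric value must be positive")
    exact = {name: 100 * value / total for name, value in metric.items()}
    weights = {name: int(share) for name, share in exact.items()}
    leftover = 100 - sum(weights.values())
    # Give the remaining points to the entries with the largest fractional part.
    by_fraction = sorted(exact, key=lambda n: exact[n] - weights[n], reverse=True)
    for name in by_fraction[:leftover]:
        weights[name] += 1
    return weights


def smooth_weighted_round_robin(weights: Dict[str, int], n: int) -> List[str]:
    """Return n picks whose frequencies follow the given weights.

    Each round every destination gains its weight; the current leader is
    picked and then pays back the total weight, which spreads picks evenly.
    """
    total = sum(weights.values())
    current = {name: 0 for name in weights}
    picks: List[str] = []
    for _ in range(n):
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


# Hypothetical scores (assumption): 30 vs 10 requests per second.
weights = to_percent_weights({"recommender-v1": 30.0, "recommender-v2": 10.0})
picks = smooth_weighted_round_robin(weights, 100)
print(weights, picks.count("recommender-v1"), picks.count("recommender-v2"))
```

In a service mesh the resulting percentages would typically be expressed as route weights in the mesh configuration instead of being applied in application code; the dispatch loop here only illustrates the effect of such weights.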