Friday, January 31, 2014

High performance processing services in GIS.lab

This week I attended the GIS Ostrava conference in the Czech Republic, where I gave an introductory talk about GIS.lab. After the talk, I was asked why GIS.lab does not include a Web Processing Service (WPS), namely PyWPS. Here is my short answer.

Currently, the GIS.lab server acts as a conventional server. It provides boot, file, geo-database and chat services. At first sight, it would be nice to add a WPS service there as well.
Then I realized how much mostly free and unused computing power sits in the modern multi-core client machines connected to a GIS.lab network (and it grows with each connected machine), and I started thinking differently. I started thinking about distributing the load of all CPU-intensive services across the client machines as well, by means of load balancing. A modern multi-core machine would not suffer at all if one of its cores were briefly occupied by a computing task requested by another member of the GIS.lab network.

To illustrate, I will describe my idea using the WMS service provided by QGIS Mapserver, on which the GIS.lab WebGIS application relies.
Currently, QGIS Mapserver is installed only on the GIS.lab server, so it can handle only a limited number of concurrent requests. Moreover, too many concurrent WebGIS users could overload the server and slow down all the other services it provides.
Since all resources served by WebGIS are shared with all members of the GIS.lab network, and all members run the same operating system, it seems reasonable to implement a load balancing service on top of QGIS Mapserver that redirects WMS requests randomly to all network members. This way, we would improve the situation drastically (disk load could be the next bottleneck).
All we need is to have all the required server software available on all machines in the GIS.lab network. I am not in favor of installing server software on every client machine. Instead, I am thinking about creating a processing software image (maybe an LXC container) on the server, which would be mounted on all client machines and used in some chroot-like environment.
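To make the random redirection of WMS requests more concrete, here is a minimal Python sketch of the selection step only. It is not existing GIS.lab code: the IP addresses, the `BACKENDS` list and the function names are made up for illustration, and I simply assume that every listed machine exposes QGIS Mapserver at the same path.

```python
import random
from urllib.parse import urlencode

# Hypothetical addresses of GIS.lab machines (server and clients) that are
# assumed to expose QGIS Mapserver at the same path.
BACKENDS = [
    "http://192.168.50.5/cgi-bin/qgis_mapserv.fcgi",
    "http://192.168.50.101/cgi-bin/qgis_mapserv.fcgi",
    "http://192.168.50.102/cgi-bin/qgis_mapserv.fcgi",
]


def pick_backend(backends, rng=random):
    """Return one backend chosen uniformly at random."""
    if not backends:
        raise ValueError("no backends available")
    return rng.choice(backends)


def redirect_url(backends, params, rng=random):
    """Build the URL that an incoming WMS request would be redirected to."""
    return pick_backend(backends, rng) + "?" + urlencode(params)


if __name__ == "__main__":
    wms_params = {"SERVICE": "WMS", "REQUEST": "GetCapabilities"}
    print(redirect_url(BACKENDS, wms_params))
```

A real balancer would also need to know which machines are currently switched on and to skip the ones that do not respond. The sketch leaves that out on purpose.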

This is a simplified description of my idea of how to create a high performance computing network from random office computers in a few minutes. Any comments?

PS: Thanks to everyone who listened to my talk. Jachyme, once I have this load balancing system, I will surely implement PyWPS on top of it.