Hungarian Supercomputing Grid

Kacsuk Péter, prof. Dr. <kacsuk@sztaki.hu>

MTA SZTAKI


Recently a 96-processor Sun HPC 10000 supercomputer, two 16-processor Compaq AlphaServer supercomputers, a 58-processor cluster and several smaller clusters were installed in Hungary as major supercomputing resources. They are located at different institutes and are used by a growing user community from academia. However, even at this early stage of their use, it has turned out that for some applications the capacity of an individual supercomputer or cluster is not sufficient to solve the problem in reasonable time. The solution is to connect these high-performance computing resources over the Hungarian academic network and to use them jointly in a newly formed supercomputing Grid.


One of the main goals of the project is to establish this Hungarian Supercomputing Grid (HSG) based on current Hungarian and international results in cluster and Grid computing. The project is closely related to two Hungarian Grid projects that are already running (NI2000/08, DemoGrid) and to several projects and programmes from other countries (Condor, INFN Grid, UK e-science). The SuperGrid project builds on the experience gained in the NI2000/08, INFN Grid and DataGrid projects and will collaborate closely with the DemoGrid, Condor and INFN Grid projects and with the UK e-science programme.


Unlike the Grids to be developed in the previously mentioned Grid projects, the HSG will be used as both a high-performance and a high-throughput Grid. To achieve both properties, Condor will be used as the main Grid-level job manager in the HSG and will be combined with P-GRADE, a high-performance program development environment developed in Hungary.


The HSG will have a layered structure. The top layer is the application layer, where a nuclear physics application based on the Monte Carlo method is currently being investigated. The user will access the HSG through the Grid portal to be developed in the project. The application will be developed in the P-GRADE parallel program development environment, which will be connected to Condor in the project. This means that the user can generate Condor jobs (containing parallel PVM or MPI programs) directly from the P-GRADE environment. Condor will be used as the Grid-level resource manager in the HSG. The basic middleware services will come from Globus. The fabric will contain the Hungarian supercomputers and clusters connected by the Hungarian academic network. On the supercomputers and clusters, local job schedulers such as LSF, PBS, Condor and Sun Grid Engine can be used.
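
As an illustration of the step in which a parallel program becomes a Condor job, the following minimal Python sketch renders a Condor submit description as plain text. It is not P-GRADE's actual output or interface: the ParallelJob class, the render_submit_description function, the executable name and the output file names are all hypothetical. The universe values PVM and MPI are assumed to follow the Condor releases of that period; current HTCondor provides a parallel universe instead.

```python
from dataclasses import dataclass


@dataclass
class ParallelJob:
    """Hypothetical description of a parallel program to be run as a Condor job."""
    executable: str      # path to the compiled PVM or MPI program
    model: str           # programming model: "PVM" or "MPI"
    machine_count: int   # number of machines requested for the run


def render_submit_description(job: ParallelJob) -> str:
    """Return the text of a Condor submit description for the given job."""
    if job.model not in ("PVM", "MPI"):
        raise ValueError(f"unsupported programming model: {job.model}")
    if job.machine_count < 1:
        raise ValueError("machine_count must be at least 1")
    lines = [
        f"universe = {job.model}",           # assumed era-specific universe name
        f"executable = {job.executable}",
        f"machine_count = {job.machine_count}",
        "output = job.out",                  # illustrative file names
        "error = job.err",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "montecarlo_sim" is a placeholder name, not a program from the project.
    print(render_submit_description(ParallelJob("montecarlo_sim", "MPI", 16)))
```

The resulting text would then be handed to Condor for submission; how the portal, Globus and the local schedulers take part in that hand-over is not specified here and is left out of the sketch.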


Besides developing a Grid portal and combining and extending Condor and P-GRADE, the other two main tasks of the project are to solve the security problems and to develop an accounting system.