July 23, 2013
Age Of Astronomy Big Data Is Already Here
High-performance computing specialists from Perth's International Centre for Radio Astronomy Research (ICRAR) today became the first users of one of Australia's leading supercomputing facilities -- the Pawsey Centre -- ahead of its official opening later this year.
"We now have more than 400 megabytes per second of MWA data streaming along the National Broadband Network from the desert 800km away," said Professor Andreas Wicenec, from The University of Western Australia node of ICRAR.
The Murchison Widefield Array is the first Square Kilometre Array precursor to enter full operations, generating a vast torrent of information that needs to be stored for later retrieval by researchers.
"To store the Big Data the MWA produces, you'd need almost three 1TB hard drives every two hours," said Prof Wicenec. "The technical challenge isn't just in saving the observations but how you then distribute them to astronomers from the MWA team in far-flung places so they can start using it."
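The "almost three 1TB hard drives every two hours" figure can be checked with simple arithmetic. The Python sketch below assumes a sustained rate of exactly 400 megabytes per second (the article says "more than 400") and decimal units, where 1TB is 1,000,000MB. The function name is illustrative only.

```python
def terabytes_stored(rate_mb_per_s: float, hours: float) -> float:
    """Return the data volume in terabytes accumulated at a steady rate."""
    seconds = hours * 3600
    megabytes = rate_mb_per_s * seconds
    return megabytes / 1_000_000  # decimal units: 1 TB = 1,000,000 MB


# Assumed steady rate of 400 MB/s over two hours of observing.
print(terabytes_stored(400, 2))  # 2.88
```

At that assumed rate, two hours of observing fills about 2.88TB, which matches the "almost three" drives in the quote.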
Two links currently connect the data stores in Perth to MWA researchers overseas: one to the Massachusetts Institute of Technology (MIT) in the United States and one to the Victoria University of Wellington in New Zealand. A future link to India -- another MWA partner -- will also be created.
"Not everyone needs all of the MWA data," said Professor Wicenec. "For example, MIT researchers are interested in the early Universe so we use filtering techniques to control what data is copied from the Pawsey Centre archive to the MIT machines. So far, more than 150TB of data has been transferred automatically to the MIT store, with a stream of up to 4TB a day increasing that value."
Professor Wicenec said the MWA is producing so much information that it would be impossible to manually decide where to send what, which is where a sophisticated archiving system -- the open-source Next Generation Archive System (NGAS) -- comes in.
"Controlling data for a widely distributed user group on this scale is a challenge that's being faced more and more frequently in science and other fields, but nothing suitable existed that could solve this problem for us," said UWA Associate Professor Chen Wu.
NGAS was initially developed by Professor Wicenec while he was at the European Southern Observatory and later modified by the ICRAR team to meet the MWA data challenge.
Associate Professor Wu said NGAS makes it irrelevant to users where data is stored: you simply ask the system for what you want, and it either provides it from the local store or retrieves it from the full archive back in Perth through a highly efficient dataflow management system.
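The local-store-or-full-archive behaviour described above can be illustrated with a minimal Python sketch. This is not the NGAS API: the `ArchiveNode` class, its `retrieve` method, and the file identifier are hypothetical names invented for illustration. Keeping a local copy after a remote fetch is likewise an assumption of the sketch, not something the article states.

```python
from typing import Dict, Optional


class ArchiveNode:
    """Hypothetical archive site: a local store plus an optional parent archive."""

    def __init__(self, name: str, parent: Optional["ArchiveNode"] = None) -> None:
        self.name = name
        self.parent = parent
        self.store: Dict[str, bytes] = {}

    def retrieve(self, file_id: str) -> bytes:
        """Return the file from the local store, else fetch it from the parent."""
        if file_id in self.store:
            return self.store[file_id]
        if self.parent is None:
            raise KeyError(f"{file_id} not found in archive {self.name}")
        data = self.parent.retrieve(file_id)
        self.store[file_id] = data  # assumption: keep a local copy for next time
        return data


# The full archive in Perth holds the observation; the remote site starts empty.
perth = ArchiveNode("perth")
perth.store["obs_0001"] = b"visibility data"
mit = ArchiveNode("mit", parent=perth)

print(mit.retrieve("obs_0001"))  # not held locally, so fetched from Perth
print("obs_0001" in mit.store)   # True: later requests are served locally
```

The caller only ever asks its nearest node for a file; whether the bytes come from the local store or from the parent archive is hidden behind the same call.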
About half of all MWA computing occurs on site in the Murchison, where signals from radio telescope antennas are combined and processed in a powerful system of computers called a correlator. What remains to be done in Perth is to produce images and to manage storage and distribution through the archive system, so MWA astronomers can analyse the collected data.
Optical fibre links the Murchison Radio-astronomy Observatory (MRO) -- where the MWA is situated -- to the Pawsey Centre in Perth. Data travels over a dedicated 10 gigabit per second connection between the MRO and Geraldton, and completes the trip to Perth on Australia's new high-speed National Broadband Network.
The MWA will store about 3 petabytes (3000TB) at the Pawsey Centre each year, which corresponds to the MWA observing about a quarter of the time. Another section of the Pawsey Centre will be a supercomputing facility that includes computing for Australia's other SKA precursor, CSIRO's Australian Square Kilometre Array Pathfinder (ASKAP), and projects from geoscience and other computationally intensive fields.
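The annual figure is consistent with the data rate quoted earlier in the article. The Python sketch below assumes the telescope produces a steady 400 megabytes per second whenever it is observing, a 365-day year, and decimal units (1 petabyte is 1,000,000,000MB). The function and constant names are illustrative only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds


def petabytes_per_year(rate_mb_per_s: float, duty_cycle: float) -> float:
    """Return the yearly data volume in petabytes for a given observing fraction."""
    megabytes = rate_mb_per_s * SECONDS_PER_YEAR * duty_cycle
    return megabytes / 1_000_000_000  # decimal units: 1 PB = 1,000,000,000 MB


# Assumed 400 MB/s while observing, with the telescope observing 25% of the time.
print(round(petabytes_per_year(400, 0.25), 2))  # 3.15
```

Under these assumptions, a quarter of a year of observing yields a little over 3 petabytes, in line with the stated storage estimate.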
"We're really impressed with iVEC and the staff from Cray and SGI," said Prof Wicenec. "We've been pushing to have resources ready as soon as possible so we could take advantage of Pawsey's capabilities for the MWA. They have provided us with very early access to the facility, which has let us interact directly with the experts and optimise the integration and setup of complex hardware and software systems."
"We're very grateful for the support from everyone in the team."