Researchers at Linköping University have developed a method that increases fivefold the computing power of a standard algorithm when it runs on a standard type of chip, the FPGA. The new method is both simple and smart, but the road to publication has been long.
The chip in question is a programmable integrated circuit known as an "FPGA", an abbreviation for "field-programmable gate array". It consists of a matrix of logic gates that can be programmed in situ, and can be reprogrammed an unlimited number of times. The first FPGAs came onto the market in 1985, and sales have increased dramatically since then. The market is now dominated by a couple of major players, and is expected to amount to USD 9.8 billion in 2020 (Wikipedia). In these chips, the researchers have increased the speed of an algorithm known as the "fast Fourier transform", which is used in spectral analysis, radar technology and telecommunications.
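For readers unfamiliar with the fast Fourier transform, the following is a minimal software sketch of the textbook radix-2 version, written in Python. It is not the researchers' hardware design, which is a pipelined core mapped onto an FPGA; the function names `fft` and `dft` are chosen here purely for illustration. The sketch shows why the algorithm is "fast": it splits the input into even and odd samples and reuses the two half-size results, instead of computing every output from scratch as the direct transform does.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT. The input length must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of the even-indexed samples
    odd = fft(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # "Twiddle factor": rotates the odd half before it is combined.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dft(x):
    """Direct transform, about n*n operations; used only as a reference."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


# Spectral analysis in miniature: a pure tone that completes three cycles
# in 16 samples shows up as a peak in frequency bin 3.
tone = [math.cos(2 * math.pi * 3 * m / 16) for m in range(16)]
spectrum = fft(tone)
peak_bin = max(range(8), key=lambda k: abs(spectrum[k]))
print(peak_bin)
```

The direct transform needs on the order of n² operations, while the recursive version needs on the order of n log n, which is what makes the algorithm practical for radar and telecommunications workloads.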
"Until now, people have believed that once an FPGA is full it cannot accommodate any more. If you want new functionality in this case, you have to completely rebuild the hardware, which is expensive," says Oscar Gustafsson, senior lecturer in the Department of Computer Engineering at Linköping University.
But Carl Ingemarsson, a PhD student at the department, had other ideas. As an undergraduate several years ago, he was challenged to increase the speed of calculation in an FPGA. If the lab group could manage to reach a frequency greater than 450 MHz, they wouldn't have to carry out the final lab in the course.
"This was what was needed to convince me to examine in depth the way the logic is represented inside the chip," he says.
He achieved the frequency, skipped the final lab, and at the same time laid the foundation for his doctoral project. The result is that FPGAs today can be made to work five times as fast, or to deal with five times the number of calculations. While it's true that Carl has only confirmed this in two families of FPGA, there is no reason to believe that it is not also the case for all other families.
"This advance will save huge sums for demanding calculations in industry, and will make it possible to implement new functionality without needing to replace the hardware," says Oscar Gustafsson.
Carl Ingemarsson's method is based on ensuring that the signal takes a smarter route through the various building blocks inside the chip.
"Normally, you choose an algorithm that can carry out the desired calculations, and then build up the structure, the architecture, using the required blocks. This is then transferred to the FPGA. But we have also looked at how the logic is built up, the routes the signals take, and what happens to them inside the chip. We have then adapted the architecture and the mapping onto the chip using the results of this analysis."
A clever change in the signal routes gives the chip a capacity that is five times greater for each hardware unit.
"It should be possible to automate this optimisation of the chip," says Carl Ingemarsson.
The method was, however, too simple, or too ingenious, for the scientific reviewers.
"At one level, it might seem that we haven't changed anything, we're still using the same standard components, but we have increased the computing power by a factor of five. This has made it difficult to get our article published in a scientific journal," Oscar Gustafsson explains.
But the solution was so clever that someone managed to plagiarise the work before the IEEE decided to publish it. It suddenly appeared at an IEEE conference, using copies of the diagrams, with parts of the text swapped out and completely different authors. All the support documentation in the form of original files and original diagrams was, however, available at LiU: the plagiarism was discovered, and the researcher suspended. The damage had been done, however, and publication of the original article was delayed by at least a year.
The article: Efficient FPGA Mapping of Pipeline SDF FFT Cores, Carl Ingemarsson, Petter Källström, Fahad Qureshi and Oscar Gustafsson, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2017, DOI 10.1109/TVLSI.2017.2710479