“The first thing you notice is how quiet it is,” says Kimmo Koski, the head of the Finnish IT Center for Science. Dr. Koski is describing LUMI (Finnish for “snow”), the most powerful supercomputer in Europe, which is located 250 kilometers south of the Arctic Circle, in the city of Kajaani in Finland.
LUMI, which opened last year, is used for everything from climate modeling to the search for new drugs. It has tens of thousands of individual processors and is capable of performing up to 429 quadrillion calculations per second. That makes it the third most powerful supercomputer in the world. Powered by hydropower and with waste heat used to help heat homes in Kajaani, it even boasts negative carbon dioxide emissions.
LUMI offers a glimpse into the future of high-performance computing (HPC), both on dedicated supercomputers and in the cloud infrastructure that runs much of the Internet. During the last decade the demand for HPC has boomed, fueled by technologies like machine learning, genome sequencing, and simulations of everything from stock markets to nuclear weapons to the climate. It is likely to continue to increase, as these types of applications will happily consume all the computing power that can be offered to them. Over the same period, the amount of computing power needed to train a cutting-edge AI model has been doubling every five months.
All of this has implications for the environment. HPC (and computing in general) is becoming a large consumer of energy. The International Energy Agency estimates that data centers account for between 1.5% and 2% of global electricity consumption, roughly as much as the entire British economy uses. That figure is expected to rise to 4% by 2030. With an eye on government promises to reduce greenhouse gas emissions, the computer industry is trying to find ways to do more with less and boost the efficiency of its products. The work is being carried out at three levels: that of the individual microchips; that of the computers built from those chips; and that of the data centers that, in turn, house the computers.
Start with the microchips themselves. Digital computers have become much more efficient in the last 80 years. A modern machine can perform about 10 trillion calculations with the same amount of energy that a single calculation would have consumed after World War II. Much of that enormous progress was the result of the industry’s attempts to adhere to Moore’s Law: the observation that the number of components that can be squeezed into an integrated circuit doubles every two years.
When the chips are down
For several decades, a happy side effect of Moore’s Law was that as circuits got smaller, they also became more frugal. This effect is known as Dennard scaling, after Robert Dennard, a scientist then working at IBM who wrote a paper on the topic in 1974. However, in the mid-2000s, the complicated physics of ultra-small components meant that the relationship began to break down. Computers continue to become more efficient as their components shrink, but the rate at which they do so has slowed dramatically.
That has forced chipmakers to work harder in pursuit of gains they once got for free. The CPUs in LUMI (the general-purpose chips that run programs and coordinate the rest of the machine) are made by AMD, an American chip designer. In addition to supercomputers, its CPUs, along with those from Intel, its biggest rival, run many of the data centers that power the Internet. In 2010, with Dennard scaling consigned to the history books, the company put improving energy efficiency “at the top of our priority list,” says Samuel Naffziger, product technology architect at AMD.
Today, its chips use a number of tricks to try to keep power consumption down. They are covered in sensors that monitor and minimize the amount of power sent to parts of the circuit depending on the tasks assigned to them. Other improvements have focused on ensuring that as much of the chip as possible is doing useful work at any given time, since idle circuits waste energy for no purpose. AMD hopes that a combination of even smarter tricks and even smaller components will allow it to increase the efficiency of its most powerful chips 30-fold by 2025, compared with 2020.
Another option is to shift work from general-purpose CPUs to specialized chips designed for a more limited range of mathematical tasks. The best known are “graphics processing units”, or GPUs. Originally developed to produce more elaborate graphics for video games, GPUs have proven to excel at many tasks that can be broken down into small parts, each of which can then be worked on simultaneously. Similarly, specialized chips are increasingly taking on tasks like networking that would previously have been left to the CPU to deal with.
These system-level adjustments are the second scale at which efficiency can be improved. “When you play with thousands of CPUs and GPUs, how they connect can make or break the power efficiency of a supercomputer,” says Justin Hotard, head of high-performance computing at Hewlett Packard Enterprise (HPE), a company that specializes in, among other things, efficient supercomputers.
Exactly how best to connect everything remains an active area of research. Sending a signal to another chip in another part of the computer consumes a large amount of power. The goal, therefore, is to minimize how often that happens and, when it does, to minimize the distance the signal has to travel. HPE prefers something known as a “dragonfly topology,” a two-layer system in which chips are connected to each other in groups, and those groups are connected to each other in turn. The system is modular, which makes it easy to expand by simply adding new nodes. And a paper published in February by Francisco Andújar, a computer scientist at the University of Valladolid, and his colleagues showed, after a lot of mathematical analysis, that the dragonfly configuration is close to the ideal design for efficient supercomputing.
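To make the two-layer idea concrete, here is a minimal sketch of a toy dragonfly-style network. It rests on simplifying assumptions that are not from the article: every router in a group is wired directly to every other router in that group, each pair of groups shares exactly one long-distance link, and the sizes (5 groups of 4 routers) are arbitrary. The function names are hypothetical, and real machines, including HPE's, use more elaborate wiring.

```python
from collections import deque
from itertools import combinations


def build_dragonfly(groups, routers_per_group):
    """Return an adjacency map for a toy two-layer dragonfly network.

    Assumption: all-to-all links inside each group, and exactly one
    global link between every pair of groups.
    """
    adj = {(g, r): set() for g in range(groups) for r in range(routers_per_group)}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    # Layer 1: short local links inside every group.
    for g in range(groups):
        for r1, r2 in combinations(range(routers_per_group), 2):
            link((g, r1), (g, r2))

    # Layer 2: one long global link per pair of groups, spread
    # round-robin over the routers of each group.
    for g1, g2 in combinations(range(groups), 2):
        r1 = (g2 - 1) % routers_per_group
        r2 = g1 % routers_per_group
        link((g1, r1), (g2, r2))

    return adj


def max_hops(adj):
    """Worst-case shortest path between any two routers (breadth-first search)."""
    worst = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst


network = build_dragonfly(groups=5, routers_per_group=4)
print(len(network), "routers, worst case", max_hops(network), "hops")
```

In this toy layout a signal never needs more than one local hop, one global hop and one more local hop, which is the property that keeps long, power-hungry trips rare.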
And efficiency doesn’t have to come at the cost of performance. Top500.org, a website, publishes rankings of supercomputers for both speed and efficiency. Its most recent list, published in June, ranks LUMI as the seventh most efficient machine in the world and the third fastest. Frontier, a computer installed at Oak Ridge National Laboratory in Tennessee, is by far the fastest in the world, about four times faster than LUMI. Yet in terms of efficiency, Frontier ranks sixth.
The final scale at which gains can be made is that of the data center, the high-tech shed that houses both supercomputers and the more mundane servers that power the Internet. Computing produces a lot of heat. Despite the new focus on efficiency, a modern CPU or GPU can produce 500 watts or more of heat at full speed. With tens of thousands of them in a single data center, that means several megawatts of heat.
Keeping them cool requires energy. The standard measure of data center efficiency is power usage effectiveness (PUE), the ratio between the total energy consumption of the data center and the amount of that consumption that is used to perform useful work. According to the Uptime Institute, a firm of IT advisers, a typical data center has a PUE of 1.58. That means that around two-thirds of its electricity goes to running its computers, while a third goes to running the data center itself, most of which will be consumed by its cooling systems.
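The arithmetic behind that two-thirds figure is simple enough to sketch. Because PUE is total energy divided by the energy that reaches the computers, the useful share is just one divided by the PUE. The snippet below is an illustration only; the function names are made up for this example, and the figures are the PUE values quoted in this article.

```python
def useful_fraction(pue):
    """Share of a data center's total energy that reaches its computers."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0: overhead is never negative")
    return 1.0 / pue


def overhead_fraction(pue):
    """Share of total energy spent on cooling and other overheads."""
    return 1.0 - useful_fraction(pue)


# PUE values quoted in the article.
for label, pue in [("typical data center", 1.58), ("Google average", 1.1),
                   ("Frontier", 1.03), ("LUMI's neighbor", 1.02)]:
    print(f"{label}: PUE {pue} -> {useful_fraction(pue):.0%} useful, "
          f"{overhead_fraction(pue):.0%} overhead")
```

A PUE of exactly 1.0 would mean every watt goes to computing, which is why values just above 1 are described as approaching the limit.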
Reaching the Finnish line
Clever design can push that number much lower. Most existing data centers rely on air cooling. Liquid cooling offers better heat transfer, at the cost of additional engineering effort. Several startups even offer to completely submerge circuit boards in specially designed liquid baths. Thanks in part to the use of liquid cooling, Frontier has a PUE of 1.03. One reason LUMI was built near the Arctic Circle was to take advantage of the cool subarctic air. A neighboring computer, built in the same facility, makes use of that free cooling to achieve a PUE rating of only 1.02. That means 98% of the electricity that goes in is converted into useful math. “This is pushing the limits of what is possible,” says Dr. Koski.
Even the best commercial data centers don’t reach those numbers. Google, for example, has an average PUE value of 1.1. The latest figures from the Uptime Institute, released in June, show that, after several years of steady improvement, global data center efficiency has been stagnant since 2018 (see chart). The main reason is economics, rather than computing. As demand for computing has increased, it makes sense for companies to keep older, less efficient infrastructure running longer.
What is currently merely nice to have will soon become a legal requirement. Mindful of their carbon reduction goals, the governments of the United States, Britain and the European Union, among others, are considering new rules that could force data centers to be more efficient. A new German law would require a PUE of no more than 1.5 by 2027 and 1.3 by 2030. “We want LUMI to illustrate how high-performance computing can cross the line to net-zero carbon emissions,” says Dr. Koski. Those who want advice could do worse than book a trip to Finland. ■