TSMC and Graphcore partnership accelerates AI training by up to 40 percent

By Matthew Griffin Intelligence and the Senses 16th March 2022

WHY THIS MATTERS IN BRIEF

As computing platforms get more powerful, and as chips evolve, we’ll enter into a virtuous cycle of continuous AI acceleration and improvement.

First we created computers, then we created Artificial Intelligence (AI), and now AI is creating computers that help create even better, faster AIs … it’s an all too familiar tale.

Now, though, putting humans back in the loop, UK-based AI computer company Graphcore, which has been hailed as the future of computing by none other than the founder of ARM, has significantly boosted its computers’ performance without changing much of anything about its specialised brain-like AI processor cores. The secret was to use TSMC’s wafer-on-wafer 3D integration technology during manufacture to attach a power-delivery chip to Graphcore’s AI processor.

The new combined chip, called Bow after a district in London, is the first on the market to use wafer-on-wafer bonding, say Graphcore executives. The addition of the power-delivery silicon means Bow can run faster – 1.85 gigahertz versus 1.35 GHz – and at lower voltage than its predecessor. That translates to computers that train neural nets up to 40 percent faster while using as much as 16 percent less energy than the previous generation. Importantly, users get this improvement with no change to their software at all.
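As a rough check on those figures, the short Python sketch below computes the clock uplift implied by the two quoted frequencies. The names are illustrative only, and comparing the result with the quoted training speedup is a back-of-the-envelope reading rather than anything Graphcore has stated.

```python
# Clock frequencies quoted in the article, in gigahertz.
MK2_CLOCK_GHZ = 1.35  # predecessor, the Colossus MK2
BOW_CLOCK_GHZ = 1.85  # Bow


def percent_increase(old: float, new: float) -> float:
    """Return the relative increase from `old` to `new` as a percentage."""
    return (new - old) / old * 100.0


clock_uplift = percent_increase(MK2_CLOCK_GHZ, BOW_CLOCK_GHZ)
print(f"Clock uplift: {clock_uplift:.1f}%")  # prints "Clock uplift: 37.0%"
```

The clock alone rises by roughly 37 percent, which is in the same range as the "up to 40 percent" training speedup quoted above.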

“We are entering an era of advanced packaging in which multiple silicon die are going to be assembled together to supplement the performance advantages we can get from increasing progress along an ever-slowing Moore’s Law path,” says Simon Knowles, Graphcore Chief Technical Officer and cofounder. Both Bow and its predecessor the Colossus MK2 were made using the same manufacturing technology, TSMC’s N7.

In other 3D-chip-stacking technology, such as Intel’s Foveros, already excised chips are attached to other chips or to wafers. In TSMC’s SoIC WoW technology, two entire wafers of chips are bonded. The chips on each have copper pads that match up when the wafers are aligned. When the two wafers are pressed together, the pads fuse.

“You can think of this as a kind of cold weld between the pads,” says Knowles. The top wafer is then thinned down to just a few micrometers and the bonded wafer is diced up into chips.

In Graphcore’s case, one wafer is full of the company’s second-generation AI processors – the company calls them IPUs, for Intelligence Processing Units – each with 1,472 IPU cores and 900 megabytes of on-chip memory. These processors were already in use in commercial systems and made a good showing in the last round of MLPerf benchmark tests. The other wafer holds a corresponding set of power-delivery chips. These chips carry no transistors or other active components. Instead, they are packed with capacitors and vertical connections called through-silicon vias. The latter make power and data connections that pass through the power chip to the processor die.

It’s the capacitors that really make the difference. These components are formed in deep, narrow trenches in the silicon, exactly like the bit-storing capacitors in DRAM. By placing these reservoirs of charge so close to the transistors, power delivery is smoothed out, allowing the IPU cores to run faster at lower voltage. Without the power-delivery chip, the IPU would have to increase its operating voltage above its nominal level to work at 1.85 GHz, consuming a lot more power. With the power chip, it can reach that clock-rate and consume less power, too.
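To see why operating voltage matters so much here, the sketch below uses the common textbook approximation that a chip's dynamic power scales with the square of its voltage times its clock frequency. That approximation is not from the article, and the voltage ratios are hypothetical values chosen only for illustration; Graphcore has not published them here.

```python
def relative_dynamic_power(voltage_ratio: float, frequency_ratio: float) -> float:
    """Relative dynamic power under the textbook approximation P ~ C * V^2 * f.

    Both arguments are ratios against a baseline, so 1.0 means "unchanged".
    """
    return voltage_ratio ** 2 * frequency_ratio


# Clock ratio from the article: 1.85 GHz versus 1.35 GHz.
freq_ratio = 1.85 / 1.35

# HYPOTHETICAL voltage ratios, for illustration only.
same_voltage = relative_dynamic_power(1.00, freq_ratio)
raised_voltage = relative_dynamic_power(1.10, freq_ratio)
lowered_voltage = relative_dynamic_power(0.80, freq_ratio)

print(f"Same voltage:    {same_voltage:.2f}x baseline power")
print(f"Raised voltage:  {raised_voltage:.2f}x baseline power")
print(f"Lowered voltage: {lowered_voltage:.2f}x baseline power")
```

Because voltage enters as a square, even a modest increase is expensive, while a modest decrease can offset much of the cost of a higher clock. Real chips also have static power and other effects that this sketch ignores.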

Graphcore executives say wafer-on-wafer technology results in a higher density of connections between the chips than attaching individual chips to a wafer. However, one long-standing concern with this technique was the “known good die” problem. That is, there are always a few chips in a batch of wafers that are flawed. Bonding two wafers would then as much as double the resulting number of flawed chips.
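The "as much as double" figure follows from simple probability. The sketch below assumes defects on the two wafers are independent, which is a simplification, and the defect rates are hypothetical numbers, not figures from Graphcore or TSMC.

```python
def bonded_defect_rate(defect_a: float, defect_b: float) -> float:
    """Probability that a bonded pair is flawed, assuming independent defects.

    A pair is good only if both of its dies are good.
    """
    return 1.0 - (1.0 - defect_a) * (1.0 - defect_b)


# HYPOTHETICAL per-die defect rates, for illustration only.
processor_defect = 0.10
power_chip_defect = 0.10

stacked = bonded_defect_rate(processor_defect, power_chip_defect)
print(f"Flawed after bonding: {stacked:.0%}")  # prints "Flawed after bonding: 19%"
```

With two 10 percent defect rates, about 19 percent of bonded pairs are flawed: just under double, because a few bad dies happen to land on top of other bad dies.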

Graphcore’s way around this is to let it happen, to a degree. Like some other new AI processors, the IPU is made up of many repeated, and therefore redundant, processor cores and other parts. Any duds can be cut off from the rest of the IPU by means of built-in fuses, says Nigel Toon, Graphcore cofounder and CEO.
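The benefit of that redundancy can be sketched with a simple binomial model. Everything below other than the 1,472-core figure is an assumption made for illustration: the number of spare cores, the per-core defect probability, and the idea that core failures are independent are all hypothetical, and the article does not say how many spare cores an IPU actually has.

```python
from math import comb


def usable_die_probability(physical_cores: int, required_cores: int,
                           core_defect_prob: float) -> float:
    """Probability that at least `required_cores` cores work on one die.

    Assumes each core fails independently with probability `core_defect_prob`.
    """
    max_bad = physical_cores - required_cores
    return sum(
        comb(physical_cores, bad)
        * core_defect_prob ** bad
        * (1.0 - core_defect_prob) ** (physical_cores - bad)
        for bad in range(max_bad + 1)
    )


REQUIRED_CORES = 1_472    # core count quoted in the article
SPARE_CORES = 16          # HYPOTHETICAL number of spare cores
CORE_DEFECT_PROB = 0.002  # HYPOTHETICAL per-core defect probability

no_spares = usable_die_probability(REQUIRED_CORES, REQUIRED_CORES, CORE_DEFECT_PROB)
with_spares = usable_die_probability(
    REQUIRED_CORES + SPARE_CORES, REQUIRED_CORES, CORE_DEFECT_PROB
)

print(f"Usable dies without spares: {no_spares:.1%}")
print(f"Usable dies with spares:    {with_spares:.1%}")
```

Under these made-up numbers, a die that needs every core to work is rarely usable, while a handful of spare cores that can stand in for fused-off duds makes almost every die usable.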

Although the new product has no transistors on the power-delivery chip, those might be coming. Using the technology only for power delivery “is just the first step for us,” says Knowles. “It will go much further than that in the near future.”

Graphcore revealed some plans for that near future, announcing that it will build supercomputers that can train “brain-scale” AIs – those having hundreds of trillions of parameters in a neural network. The “Good” computer, named in honor of British mathematician I.J. “Jack” Good, would be an exascale supercomputer capable of more than 10 exaflops – 10 billion billion floating-point operations per second. Good would be made up of 512 systems with 8,192 IPUs along with mass storage, CPUs, and networking. It will have 4 petabytes of memory and a bandwidth of more than 10 PB per second. Graphcore estimates each supercomputer will cost about US $120 million and should be ready for delivery in 2024.
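Dividing those headline numbers through gives a feel for the scale of each building block. The sketch below is plain arithmetic on the quoted figures; it assumes decimal units (1 PB = 10^15 bytes) and treats "more than 10 exaflops" as exactly 10, so the per-IPU compute figure is a lower bound.

```python
EXAFLOPS = 10**18  # floating-point operations per second in one exaflops
PETABYTE = 10**15  # bytes, assuming decimal units

# Figures quoted in the article.
total_flops = 10 * EXAFLOPS
total_ipus = 8_192
total_systems = 512
total_memory_bytes = 4 * PETABYTE

ipus_per_system = total_ipus // total_systems
flops_per_ipu = total_flops / total_ipus
memory_per_ipu_gb = total_memory_bytes / total_ipus / 10**9

print(f"IPUs per system:    {ipus_per_system}")
print(f"Compute per IPU:    {flops_per_ipu / 10**15:.2f} petaflops")
print(f"Memory per IPU:     {memory_per_ipu_gb:.0f} GB")
```

That works out to 16 IPUs per system, about 1.22 petaflops per IPU, and roughly 488 GB of memory per IPU. The last figure is far more than the 900 MB held on each chip, which suggests the 4 PB total counts memory outside the processors themselves.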

“When we started Graphcore…the idea has always been in the back of our mind to build an ultra-intelligent computer that would surpass the capability of a human brain,” says Toon. “And that is what we are now working on.”

Competitor Cerebras Systems, meanwhile, has already planted its flag in the quest for brain-scale AI. It developed an external memory system and a way to connect multiple computers that would allow its computers to train neural networks with hundreds of trillions of parameters.
