But the benefits of larger neural networks come with tradeoffs. The
more parameters and layers you add to a neural network, the more
expensive its training becomes. According to an estimate by Chuan Li,
the Chief Science Officer of Lambda, a provider of hardware and cloud
resources for deep learning, it could take up to 355 years and $4.6
million to train GPT-3 on a server with a V100 graphics card.
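That figure can be reproduced with a back-of-the-envelope calculation. The short Python sketch below is a rough illustration rather than Li's exact method; the training compute (about 3.14 × 10^23 floating-point operations, as reported in the GPT-3 paper), the V100's sustained throughput, and the cloud price per GPU-hour are all assumed values.

# Rough reproduction of the single-V100 estimate. All three inputs
# are assumptions: GPT-3's reported training compute, a plausible
# sustained V100 throughput, and a typical cloud price per GPU-hour.
TOTAL_FLOPS = 3.14e23       # training compute from the GPT-3 paper (assumed)
V100_FLOPS_PER_SEC = 28e12  # ~28 TFLOPS sustained on one V100 (assumed)
USD_PER_GPU_HOUR = 1.50     # cloud rental rate (assumed)

seconds = TOTAL_FLOPS / V100_FLOPS_PER_SEC
years = seconds / (365.25 * 24 * 3600)
cost = (seconds / 3600) * USD_PER_GPU_HOUR
print(f"{years:,.0f} years, ${cost:,.0f}")  # ~355 years, ~$4.7 million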
“Our calculation with a V100 GPU is extremely simplified. In practice
you can’t train GPT-3 on a single GPU, but with a distributed system
with many GPUs like the one OpenAI used,” Li says. “One will never get
perfect scaling in a large distributed system due to the overhead of
device-to-device communication. So in practice, it will take more than
$4.6 million to finish the training cycle.”
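To see why communication overhead pushes the bill past the idealized figure, consider a toy scaling model. The sketch below is hypothetical; the cluster size and the scaling efficiency are invented for illustration and are not OpenAI's numbers.

# Toy model of imperfect scaling: 'efficiency' is the fraction of the
# cluster's ideal aggregate throughput left after device-to-device
# communication overhead. Both inputs below are invented examples.
SINGLE_GPU_YEARS = 355   # from the single-V100 estimate above
SINGLE_GPU_COST = 4.6e6  # USD, from the same estimate

def distributed_run(num_gpus: int, efficiency: float):
    wall_clock_years = SINGLE_GPU_YEARS / (num_gpus * efficiency)
    total_cost = SINGLE_GPU_COST / efficiency  # billed GPU-hours grow
    return wall_clock_years, total_cost

years, cost = distributed_run(num_gpus=1024, efficiency=0.75)
print(f"{years * 365.25:,.0f} days, ${cost:,.0f}")  # ~169 days, ~$6.1M

At any efficiency below 100 percent, the total billed GPU-hours, and therefore the cost, exceed the single-GPU figure, which is Li's point.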
This estimate is still simplified. Training a neural network is hardly a
one-shot process. It involves a lot of trial and error, and engineers must
often change the settings and retrain the network to obtain optimal
performance.
“There are certainly behind-the-scenes costs as well: parameter tuning,
the prototyping that it takes to get a finished model, the cost of
researchers, so it certainly was expensive to create GPT-3,” says Nick
Walton, the co-founder of Latitude and the creator of AI Dungeon, a
text-based game built on GPT-2.
Walton said that the real cost of the research behind GPT-3 could be
anywhere between … and … times the cost of training the final model, but
he added, “It’s really hard to say without knowing what their process
looks like internally.”
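Walton's multiplier framing is easy to make concrete. The figures below are hypothetical multipliers chosen for illustration, not his estimate.

# Hypothetical illustration of total research cost as a multiple of
# the final training run; the multipliers are invented, not Walton's.
FINAL_RUN_COST = 4.6e6  # USD, the idealized single-run estimate
for multiplier in (2, 3, 5):
    print(f"{multiplier}x the final run -> ${multiplier * FINAL_RUN_COST:,.0f}")
# Every discarded prototype, tuning sweep, and retrained variant adds
# another fraction of a full run to the total bill.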
GOING TO A FOR-PROFIT MODEL
OpenAI was founded in late 2015 as a nonprofit research lab with the
mission to develop human-level AI for the benefit of all humanity.
Among its founders were Tesla CEO Elon Musk and Sam Altman,
former Y Combinator president, who collectively donated $1 billion to
the lab’s research. Altman later became the CEO of OpenAI.