The 24GB card mentioned is almost certainly an RTX Titan, which is $3,000 each. Just the card. Second, training frameworks like Megatron can distribute a model across multiple GPUs in the same computer as if they were on different machines, but a naive trainer is greatly helped by NVLink to actually pool the memory and keep throughput up, which means V100s, which are $5,000 each. (Also, people use Linux for ML.)
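To see why pooled memory matters, here's a rough sketch of naive model parallelism in PyTorch (not Megatron's actual implementation; the layer sizes and device fallback are illustrative). Every forward pass ships activations across the device boundary, which is exactly the traffic NVLink accelerates relative to PCIe:

```python
import torch
import torch.nn as nn

# Illustrative only: split a model across two devices so neither GPU
# has to hold the whole thing. Falls back to CPU when fewer than two
# GPUs are present, so the sketch still runs anywhere.
dev0 = torch.device("cuda:0" if torch.cuda.device_count() >= 2 else "cpu")
dev1 = torch.device("cuda:1" if torch.cuda.device_count() >= 2 else "cpu")

class SplitModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.part1 = nn.Sequential(nn.Linear(512, 512), nn.ReLU()).to(dev0)
        self.part2 = nn.Linear(512, 10).to(dev1)

    def forward(self, x):
        h = self.part1(x.to(dev0))
        # Activations hop between devices here every step; without
        # NVLink this copy goes over PCIe and becomes the bottleneck.
        return self.part2(h.to(dev1))

model = SplitModel()
out = model(torch.randn(4, 512))
print(tuple(out.shape))  # (4, 10)
```

This is the "as if they were on different machines" case: it works, but interconnect bandwidth dictates how badly the split hurts.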
Most universities have access to supercomputers, including GPU clusters. But that's not the point: not every NLP problem requires experimenting with 175B-parameter models.
Academic researchers shouldn't try to compete with Google or OpenAI in scaling up models. They should try to come up with new approaches. Our brains have been evolving under tight constraints (size, energy, noise, etc.). Maybe a good academic problem to solve is "how can I do what GPT-3 does if I only have an 8-GPU workstation?" This might lead to all kinds of breakthroughs.