My understanding is that the binding constraint in training these models is the quantity of computation they consume. While humans make do with drastically less input data, we also have drastically more computational resources in our heads to work on the problem than Google is using to train its models.