AI surpassing humans on a benchmark that is named after a general ability is not the same as AI surpassing humans on that general ability. 

For example, just because a benchmark has “reading comprehension” in its name doesn’t mean that it tests general reading comprehension.
