When reviewing content manually, two major problems make it hard to discover redundant content at any but the smallest scale: 1) when multiple people review for redundancy, it is difficult for them to apply the same definition, since redundancy can be very subjective; and 2) it is impossible for one person, much less several, to know about every combination of pages on the site.
Note that lower-level fields are sometimes needed to compute more useful fields. Higher-level fields can also be more difficult to compute, so they are not always worth the effort.
Chimera can calculate the more concrete Near Text Duplicate.
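To make the idea of a near text duplicate concrete, here is a minimal sketch of one common way to score it: split each page's text into overlapping word sequences (shingles) and compare the sets with Jaccard similarity. This is an illustration only and not Chimera's actual method, which is not described here. The function names, the shingle size `k`, and the `threshold` of 0.8 are all assumptions chosen for the example.

```python
import re
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) found in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        # Assumption: two empty pages count as identical.
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicate_pairs(pages, threshold=0.8, k=3):
    """Compare every pair of pages and return those scoring at or above threshold.

    `pages` maps a page identifier (for example a URL path) to its text.
    The result is a list of (page_a, page_b, score) tuples.
    """
    sets = {url: shingles(text, k) for url, text in pages.items()}
    results = []
    for u, v in combinations(sorted(sets), 2):
        score = jaccard(sets[u], sets[v])
        if score >= threshold:
            results.append((u, v, score))
    return results
```

Because a fixed threshold replaces individual judgment, every pair of pages is measured against the same definition, and the pairwise loop covers combinations that no single reviewer could hold in mind. Comparing every pair grows quadratically with the number of pages, so a large site would likely need an approximate technique such as MinHash instead of this brute-force loop.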