From what I can tell, the non-member agreement contains the core and common language regarding what one can do with the data (excluding databases with their own license). Reproducing the data is prohibited, but an analysis of the data should be consistent with the license. The core wording is
User shall not publish, retransmit, display, redistribute, reproduce
or commercially exploit the Data in any form
with exceptions for short excerpts.
If the analysis produces "546869732069732074657874", that would violate the license, since it is merely a different encoding of the original text (here, the hexadecimal ASCII codes for "This is text") from which the text can be reproduced. At least one of the special licenses explicitly permits analysis that doesn't allow reconstruction of the text:
summaries, analyses and interpretations of the linguistic properties
of the Data may be derived and published provided it is not possible
to reconstruct the Data from such summaries
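To make the distinction concrete, here is a minimal Python sketch contrasting a reversible encoding with a summary that loses information. The function name `word_length_histogram` and the sample strings are my own illustrations and do not come from any license. Whether a particular summary counts as "not possible to reconstruct" is a legal question this sketch does not settle.

```python
from collections import Counter

# The hex string from the example above: a pure re-encoding of the text.
encoded = "546869732069732074657874"

# Decoding recovers the original exactly, so publishing the hex string
# amounts to reproducing the text itself.
decoded = bytes.fromhex(encoded).decode("ascii")
print(decoded)  # This is text


def word_length_histogram(text: str) -> Counter:
    """Hypothetical 'linguistic summary': how many words have each length."""
    return Counter(len(word) for word in text.split())


# Two different texts yield the same histogram, so the histogram alone
# cannot tell us which text it came from.
summary_a = word_length_histogram(decoded)
summary_b = word_length_histogram("That is town")
print(summary_a == summary_b)  # True
```

The hex string maps back to exactly one text, whereas many texts share the same histogram. That is the difference between an encoding and a summary in the sense the license wording seems to intend.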
Another of the special licenses says something similar:
Summaries, analyses and interpretations of the linguistic properties
of the information may be derived and published, provided it is not
possible to reconstruct the information from these summaries.
but it uses the troubling term "information" rather than "data", and nobody knows what "information" is.
The license for the MS Indian Language databases is more restrictive: it prohibits distributing derivative works without permission, and the licensor might deem a mapping of the original data to be a derivative work. The license for the COMLEX databases requires permission to "redistribute any product or derivative work based on the Database" (emphasis added).