I travel to communities of minority language speakers and do language documentation and description. Language documentation is done through recording word lists and texts which are later compiled into corpora. Language description primarily entails studying grammar. often through the use of questionnaires. For example Dahl’s questionnaire is a common starting point in describing tense and aspect system. Here’s first few sentences of the questionnaire. The speakers are asked to translate the sentences into their language and, based on their response we learn how the tense system in their language functions. Typically it’s just a starting point and after the first few days of work with a specific questionnaire, adjustments and additional questions are made to better suit the target language. We also construct sentences in the target language and ask the speakers if the sentences are grammatical and what is their meaning. This part is very important because while a lot can be learned from text collections (and it’s usually a better source than questionnaires), it’s impossible to know from the texts alone whether a specific construction is ungrammatical or simply rare.
While I mostly do purely academic research, the results can also used to make textbooks and other learning materials. Annotated text collections are also valuable both for scholars and for heritage language learners.
For the sake of anonymity I don’t want to name specific languages I study. I work with two Uralic languages, one has a few thousand speakers, another has ~100-150 speakers, both are severely endangered by UNESCO classification.
One person to edit 250 AI articles per week? I’d be very surprised if they found them.