• circuitfarmer@lemmy.sdf.org
      7 months ago

      A great many languages are entirely undocumented or severely lacking in documentation. One part of my job is collecting data for such languages. Another part is more traditional computational linguistics, which in my case is primarily corpus analysis (still a relatively common step in developing model training data).