This document discusses the use of statistics and probability in corpus linguistics. It explains that statistical methods give linguists useful tools for understanding languages. Probabilities in particular can be used to estimate word frequencies and to build probabilistic models of spelling. The document also discusses best practices for annotating corpora, including annotating enough data for results to reach statistical significance and avoiding errors such as testing machine learning models on the same data they were trained on.
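
As a rough illustration of two of the points above, the following sketch estimates word probabilities by relative frequency and keeps a held-out portion of the corpus separate from the training portion. The toy corpus, the function names (`train_test_split`, `relative_frequencies`, `unseen_rate`), and the 25% held-out fraction are all assumptions made for this example; they are not taken from the document, and a real study would use a far larger corpus and likely a smoothed estimator.

```python
from collections import Counter


def train_test_split(tokens, test_fraction=0.2):
    """Split a token list into a training part and a held-out part (no shuffling)."""
    cut = int(len(tokens) * (1 - test_fraction))
    return tokens[:cut], tokens[cut:]


def relative_frequencies(tokens):
    """Unsmoothed estimate: P(w) = count(w) / total number of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def unseen_rate(probs, tokens):
    """Fraction of tokens whose word type has no probability estimate."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t not in probs) / len(tokens)


# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat and the dog sat on the log".split()
train, held_out = train_test_split(corpus, test_fraction=0.25)

# Estimate probabilities from the training portion only.
probs = relative_frequencies(train)

# Evaluating on the training data hides the problem of unseen words;
# evaluating on held-out data exposes it.
rate_on_train = unseen_rate(probs, train)
rate_on_held_out = unseen_rate(probs, held_out)
print(f"P(the) = {probs['the']:.3f}")
print(f"unseen rate on training data: {rate_on_train:.2f}")
print(f"unseen rate on held-out data: {rate_on_held_out:.2f}")
```

The unseen rate on the training portion is zero by construction, which is why a score computed on training data says little about how the estimates behave on new text.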