|
362 | 362 | "source": [
|
363 | 363 | "### Lowercasing\n",
|
364 | 364 | "\n",
|
365 |
| - "While we acknowledge that the **casing** of words is informative, we often don't work in contexts where we can properly utilize this information.\n", |
| 365 | + "While we acknowledge that a word's casing is informative, we often don't work in contexts where we can properly utilize this information.\n", |
366 | 366 | "\n",
|
367 | 367 | "More often, the subsequent analysis we perform is **case-insensitive**. For instance, in frequency analysis, we want to account for various forms of the same word. Lowercasing the text data aids in this process and simplifies our analysis.\n",
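A minimal sketch of why lowercasing matters for frequency analysis (using a small hypothetical list of tokens, not the workshop data):

```python
from collections import Counter

# Example tokens with mixed casing (hypothetical data).
tokens = ["Apple", "apple", "APPLE", "Banana", "banana"]

# Without lowercasing, casing variants of the same word are counted separately.
raw_counts = Counter(tokens)

# Lowercasing collapses the variants into a single form.
lower_counts = Counter(t.lower() for t in tokens)

print(raw_counts["apple"])    # → 1
print(lower_counts["apple"])  # → 3
```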
|
368 | 368 | "\n",
|
|
435 | 435 | "\n",
|
436 | 436 | "Our goal in this workshop is not to provide a deep (or even shallow) dive into regex; instead, we want to expose you to them so that you are better prepared to do deep dives in the future!\n",
|
437 | 437 | "\n",
|
438 |
| - "The following example is a poem by William Wordsworth. Like many poems, the text may contain extra line breaks (i.e., newline characters, `\\n`) that we want to remove.\n", |
439 |
| - "\n", |
440 |
| - "Let's read the data in!" |
| 438 | + "The following example is a poem by William Wordsworth. Like many poems, the text may contain extra line breaks (i.e., newline characters, `\\n`) that we want to remove." |
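As a sketch of the kind of cleanup described above, `re.sub` can collapse runs of newline characters into single spaces (the excerpt below is a short hypothetical string, not the actual poem file read in the workshop):

```python
import re

# A short excerpt with hard line breaks (hypothetical text).
poem = "I wandered lonely as a cloud\nThat floats on high\n"

# Replace one or more consecutive newline characters with a single space,
# then strip any leading/trailing whitespace.
cleaned = re.sub(r"\n+", " ", poem).strip()

print(cleaned)  # → "I wandered lonely as a cloud That floats on high"
```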
441 | 439 | ]
|
442 | 440 | },
|
443 | 441 | {
|
|
1096 | 1094 | "\n",
|
1097 | 1095 | "The first package we'll be using is called **Natural Language Toolkit**, or `nltk`. \n",
|
1098 | 1096 | "\n",
|
1099 |
| - "Let's install a couple modules within the package." |
| 1097 | + "Let's install a couple of modules from the package." |
1100 | 1098 | ]
|
1101 | 1099 | },
|
1102 | 1100 | {
|
|
1841 | 1839 | "\n",
|
1842 | 1840 | "In this section, we will demonstrate tokenization in **BERT** (Bidirectional Encoder Representations from Transformers), which utilizes a tokenization algorithm called [**WordPiece**](https://huggingface.co/learn/nlp-course/en/chapter6/6). \n",
|
1843 | 1841 | "\n",
|
1844 |
| - "We will load the tokenizer of BERT from the package `transformers`, which hosts a number of Transformer-based LLMs (e.g., GPT-2). We won't go into the architecture of Transformer in this workshop, but feel free to check out the D-lab workshop on [GPT Fundamentals](https://github.com/dlab-berkeley/GPT-Fundamentals)!" |
| 1842 | + "We will load BERT's tokenizer from the package `transformers`, which hosts a number of Transformer-based LLMs. We won't go into the Transformer architecture in this workshop, but feel free to check out the D-lab workshop on [GPT Fundamentals](https://github.com/dlab-berkeley/GPT-Fundamentals)!" |
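The core idea behind WordPiece is greedy longest-match splitting of a word into known subword units, with continuation pieces prefixed by `##` as in BERT's real vocabulary. A toy sketch of that matching loop (using a tiny hypothetical vocabulary, not BERT's actual ~30k-token vocabulary or the `transformers` implementation):

```python
# Toy WordPiece-style tokenizer: greedy longest-match against a tiny,
# hypothetical vocabulary. Continuation pieces carry the "##" prefix.
vocab = {"token", "word", "##ization", "##piece", "##s"}

def wordpiece(word, vocab):
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            sub = word[start:end]
            if start > 0:
                sub = "##" + sub
            if sub in vocab:
                piece = sub
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no piece matches: emit an unknown token
        pieces.append(piece)
        start = end
    return pieces

print(wordpiece("tokenization", vocab))  # → ['token', '##ization']
print(wordpiece("wordpieces", vocab))    # → ['word', '##piece', '##s']
```

The real BERT tokenizer adds pretokenization, special tokens like `[CLS]` and `[SEP]`, and a learned vocabulary, but the greedy longest-match step is the same idea.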
1845 | 1843 | ]
|
1846 | 1844 | },
|
1847 | 1845 | {
|
|