economics:r:text-manipulation [2018/10/23 14:06]
Olivier Simard-Casanova [Create a dummy variable based on text]
-# Manipulate text variables 
-Many real-life databases were not created for scientific or analytic purposes, so they can be dirty or messy.
-This page is especially useful if you need to extract or work with string/text variables. 
-## Create a dummy variable based on text 
-Assume you have a string variable and want to create a new binary variable that takes the value 0 or 1, depending on whether some text is present.
-df$dummy <- 2  # sentinel: rows still equal to 2 afterwards matched neither string
-df$dummy[grepl("a specific string", df$varToProcess, fixed = TRUE)] <- 0
-df$dummy[grepl("another specific string", df$varToProcess, fixed = TRUE)] <- 1
-The first line fills `dummy` with `2` as a sentinel value to catch potential errors: any row where `dummy` is still `2` after running this code matched neither string, which usually means something went wrong or `varToProcess` holds an unexpected value.
-`grepl` returns `TRUE` for each element of the variable to process that contains the given string (https://stat.ethz.ch/R-manual/R-devel/library/base/html/grep.html). `fixed = TRUE` ensures that the string is matched literally, as text, rather than interpreted as a regular expression.
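As a minimal sketch of the approach described above, the following runs it on a small made-up data frame. The data, the search strings `"card"` and `"cash"`, and the `"a.b"` pattern are assumptions chosen for illustration only; `df`, `varToProcess` and `dummy` follow the names used on this page.

```r
# Hypothetical example data (not from a real database).
df <- data.frame(
  varToProcess = c("paid by card", "paid in cash", "unknown", NA),
  stringsAsFactors = FALSE
)

df$dummy <- 2  # sentinel: rows left at 2 matched neither string
df$dummy[grepl("card", df$varToProcess, fixed = TRUE)] <- 0
df$dummy[grepl("cash", df$varToProcess, fixed = TRUE)] <- 1

# Tabulate the result; any count under "2" flags unmatched rows.
print(table(df$dummy))

# With fixed = TRUE the dot is a literal dot; as a regular expression
# it would match any single character.
literal_match <- grepl("a.b", "axb", fixed = TRUE)
regex_match   <- grepl("a.b", "axb")
```

In this sketch the rows containing `"unknown"` and `NA` keep the sentinel value, because `grepl` returns `FALSE` for missing values. Also note that a row containing both strings would end up with `1`, since the second assignment overwrites the first.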
  • Last modified: 11 months ago
  • by Olivier Simard-Casanova