While there are many great resources for getting help with R, sometimes you just need a second opinion. This is where the many Internet help boards come in handy, most notably Stack Overflow.
Start posting on Stack Overflow and you will soon learn the importance of the minimal reproducible example (MRE). Without one, you may well be refused “service.”
So, what is an MRE? The name is fairly self-descriptive: it is the smallest possible example that contains all the information necessary (in this case, for someone to help you with your code). Here’s a great walkthrough on the topic written specifically for R coding (fittingly posted to Stack Overflow).
In this example we are focusing on setting up a minimal reproducible data set, in our case a data frame. The post above suggests using R’s built-in data frames to build an MRE, which is a great idea. In fact, it removes the need for what we are about to do, which is sample from one of these built-in data frames.
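As a rough sketch of what sampling from a built-in data frame can look like, the snippet below draws five random rows from iris. The names rows and iris_sample, the seed value and the sample size are my own choices for illustration, not something the walkthrough prescribes.

```r
# Draw a small, repeatable sample of rows from the built-in iris data frame.
set.seed(42)                     # fix the seed so others can draw the same rows
rows <- sample(nrow(iris), 5)    # five random row indices
iris_sample <- iris[rows, ]      # keep all columns for those rows
iris_sample
```

Setting the seed matters here: without it, anyone re-running the code would get a different five rows, and the example would no longer be reproducible.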
Regardless, I want to point out a cool alternative way to build a minimal reproducible data frame in R. We will do this using four R functions: dput and dget, then dump and source.
Dput and Dget
Let’s take the first five rows of the iris dataset. Using dput we will write the data frame iris5 to an ASCII text representation. You could then paste this code (the part that starts with structure()) into a help forum, and whoever answers can in turn assign it to an object (I assigned mine to irismre).
# for example - get first 5 rows of iris dataset
iris5 <- head(iris, 5)

# write to an ASCII text representation
dput(iris5)

# paste it back and assign to new object
irismre <- structure(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5),
    Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6),
    Petal.Length = c(1.4, 1.4, 1.3, 1.5, 1.4),
    Petal.Width = c(0.2, 0.2, 0.2, 0.2, 0.2),
    Species = structure(c(1L, 1L, 1L, 1L, 1L),
        .Label = c("setosa", "versicolor", "virginica"),
        class = "factor")),
    .Names = c("Sepal.Length", "Sepal.Width", "Petal.Length",
        "Petal.Width", "Species"),
    row.names = c(NA, 5L), class = "data.frame")
irismre
The dput output grows with your data, so a big dataset produces a big block of text. Try to keep your minimal reproducible dataset small. That is the whole point of an MRE!
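One way to keep that output short, sketched here under my own assumptions about which columns a question might need, is to select only the relevant columns and drop unused factor levels before calling dput. The name iris_small and the choice of columns are hypothetical.

```r
# Keep only the columns and rows the question actually needs.
iris_small <- head(iris[, c("Sepal.Length", "Species")], 3)

# Drop factor levels that no longer appear ("versicolor", "virginica"),
# so they are not carried along in the dput output.
iris_small <- droplevels(iris_small)

dput(iris_small)
```

The exact text dput prints can differ between R versions, but it should be noticeably shorter than the output for the full five-column data frame.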
Rather than printing the ASCII text representation to the console, you could write it to a file with the “file =” argument in dput. Then read it back with dget:
# or you can write to a file
dput(iris5, file = "C:/RFiles/iris5.R")

# and read it back
irismre <- dget("C:/RFiles/iris5.R")
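If you want to try this without creating a folder on your drive, here is a small variation that uses a temporary file and then checks that the data frame survived the round trip. The variable path and the use of tempfile() and all.equal() are my additions, not part of the original example.

```r
iris5 <- head(iris, 5)

# Use a temporary file instead of a hard-coded path.
path <- tempfile(fileext = ".R")

dput(iris5, file = path)     # write the text representation to the file
irismre <- dget(path)        # read it back into a new object

all.equal(iris5, irismre)    # compare the original with the restored copy

unlink(path)                 # clean up the temporary file
```

I used all.equal() rather than a stricter comparison because it checks the values and attributes a reader cares about without depending on how R stores row names internally.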
Dump and Source
In the above example we re-assigned the data frame to an object name of our own choosing. With dump and source, R saves and reloads the object under its original name. So, in our example, we dump the object named “iris5” to a file and remove it from our environment with rm(). When we load the file back with source and list the objects in our environment with ls(), we will see iris5 again.
# or use dump and source to keep the object's original name
dump("iris5", file = "C:/RFiles/data.R")
rm(iris5)
source("C:/RFiles/data.R")
ls()
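dump also accepts a vector of object names, so several objects can travel in one file. The sketch below is my own extension of the example: cars3, path and restored are hypothetical names, and sourcing into a separate environment is just one way to avoid overwriting objects you already have.

```r
iris5 <- head(iris, 5)
cars3 <- head(mtcars, 3)

# Temporary file, so nothing is written to a fixed location.
path <- tempfile(fileext = ".R")

# Write both objects, with their names, to a single file.
dump(c("iris5", "cars3"), file = path)

# Load them into a fresh environment instead of the global one.
restored <- new.env()
source(path, local = restored)

ls(restored)     # lists the names of the restored objects

unlink(path)     # clean up the temporary file
```

After this runs, restored$iris5 and restored$cars3 hold the reloaded copies, while the originals in your workspace are left untouched.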