Reading and Writing Data to and from R


Reading files into R

Usually we will be using data already in a file that we need to read into R in order to work on it. R can read data from a variety of file formats, for example files created as text, or in Excel, SPSS or Stata. We will mainly be reading files in text format: .txt or .csv (comma-separated, commonly created in Excel).

To read an entire data frame directly, the external file will usually have a special form:

  • The first line of the file should have a name for each variable in the data frame.
  • Each additional line of the file has as its first item a row label, followed by the values for each variable.

Here we use the example dataset airquality, provided as airquality.csv and airquality.txt.

Input file form with names and row labels:

     Ozone  Solar.R   Wind   Temp   Month   Day
1       41      190    7.4     67       5     1
2       36      118    8.0     72       5     2
3       12      149   12.6     74       5     3
4       18      313   11.5     62       5     4
5       NA       NA   14.3     56       5     5
   ...

By default numeric items (except row labels) are read as numeric variables. This can be changed if necessary.

The function read.table() can then be used to read the data frame directly:

     > airqual <- read.table("C:/Desktop/airquality.txt")

Similarly, to read .csv files the read.csv() function can be used to read in the data frame directly.

[Note: I have noticed that occasionally you'll need to use a double slash (//) in your path. This seems to depend on the machine.]

> airqual <- read.csv("C:/Desktop/airquality.csv")
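The file layout and the two reading functions described above can be tried without the course files. The following is a minimal sketch, assuming small temporary files (the paths txt_path and csv_path are made up for illustration, and only the first three rows shown above are used):

```r
# Build a tiny file in the layout described above: a header line with one
# name per variable, then rows that each start with a row label.
txt_path <- tempfile(fileext = ".txt")   # hypothetical temporary file
writeLines(c("Ozone Solar.R Wind Temp Month Day",
             "1 41 190 7.4 67 5 1",
             "2 36 118 8.0 72 5 2",
             "3 12 149 12.6 74 5 3"), txt_path)

# The header line has one fewer field than the data rows, so read.table()
# treats the first line as names and the first column as row labels.
airqual_txt <- read.table(txt_path)
str(airqual_txt)

# The same rows as a comma-separated file without row labels;
# read.csv() assumes header = TRUE by default.
csv_path <- tempfile(fileext = ".csv")   # hypothetical temporary file
writeLines(c("Ozone,Solar.R,Wind,Temp,Month,Day",
             "41,190,7.4,67,5,1",
             "36,118,8.0,72,5,2",
             "12,149,12.6,74,5,3"), csv_path)
airqual_csv <- read.csv(csv_path)
dim(airqual_csv)
```

If a text file has a header line but no row labels, the header is not detected automatically, and header=TRUE would need to be passed to read.table().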

In addition, you can read in files using the file.choose() function in R. After typing this command in R, you can manually select the directory and file where your dataset is located.
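One way to combine file.choose() with a reading function is to pass its result as the file path. This is only a sketch: read_chosen_csv is a hypothetical helper name, not part of R, and the dialog only appears in an interactive session.

```r
# Hypothetical helper: file.choose() opens a file-picker dialog and returns
# the chosen path as a character string, which read.csv() then reads.
# Supplying a path explicitly skips the dialog.
read_chosen_csv <- function(path = file.choose()) {
  read.csv(path)
}

# In an interactive session, calling it with no argument opens the dialog.
if (interactive()) {
  airqual <- read_chosen_csv()
}
```

The same idea works directly at the prompt, for example read.table(file.choose()) for a text file.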

  1. Read the airquality.csv file into R using the read.csv command.
  2. Read the airquality.txt file into R using the file.choose() command.

Occasionally, you will need to read in data that does not already have column name information. For instance, the dataset BOD.txt looks like this:

1    8.3
2   10.3
3   19.0
4   16.0
5   15.6
7   19.8

Initially, there are no column names associated with the dataset. We can use the colnames() command to assign column names to the dataset. Suppose that we want to assign the column names "Time" and "demand" to the BOD.txt dataset. To do so we do the following:

> bod <- read.table("BOD.txt", header=F)

> colnames(bod) <- c("Time","demand")

> colnames(bod)

[1] "Time"   "demand"

The first command reads in the dataset; the argument "header=F" specifies that there are no column names associated with the dataset. The second command assigns the two column names, and the third displays them.
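To see what R does before the names are assigned, the steps can be run on a temporary copy of the data. This is a sketch under assumptions: bod_path is a made-up temporary file holding the six rows listed above, and the col.names argument of read.table() is shown as an alternative that this tutorial does not otherwise use.

```r
# Write the six BOD rows to a temporary file with no header line.
bod_path <- tempfile(fileext = ".txt")   # hypothetical temporary file
writeLines(c("1 8.3", "2 10.3", "3 19.0", "4 16.0", "5 15.6", "7 19.8"),
           bod_path)

# With header = FALSE, R invents the default names V1, V2, ...
bod <- read.table(bod_path, header = FALSE)
default_names <- colnames(bod)
default_names

# Replace the default names, as in the text above.
colnames(bod) <- c("Time", "demand")
colnames(bod)

# Alternative: supply the names while reading, via col.names.
bod2 <- read.table(bod_path, header = FALSE,
                   col.names = c("Time", "demand"))
```

Writing FALSE in full rather than F is a little safer, since F is an ordinary variable that can be reassigned.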

Read in the cars.txt dataset and call it car1. Make sure you use the "header=F" option to specify that there are no column names associated with the dataset. Next, assign "speed" and "dist" to be the first and second column names of the car1 dataset.

The two videos below provide nice explanations of different methods to read data from a spreadsheet into an R dataset.

Import Data, Copy Data from Excel to R, Both .csv and .txt Formats (R Tutorial 1.3) MarinStatsLectures

Importing Data and Working With Data in R (R Tutorial 1.4) MarinStatsLectures

Writing Data to a File


After working with a dataset, we might like to save it for future use. Before we do this, let's first set up a working directory so we know where we can find all our data sets and files later.

Setting up a Directory

In the R window, click on "File" and then on "Change dir". You should then see a box pop up titled "Choose directory". For this course, choose the directory "Desktop" by clicking on "Browse", then select "Desktop" and click "OK". In the future, you may want to create a directory on your computer where you keep your data sets and code for this class.

Alternatively, you can use the setwd() function to set the working directory.

> setwd("C:/Desktop")

To detect out what your current working directory is, type

> getwd()
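The two functions can be combined to switch directories temporarily and then switch back. This is a minimal sketch, assuming R's session temporary directory (tempdir()) as a stand-in for your own folder; the names old_dir and in_temp are made up for illustration.

```r
# setwd() returns the previous working directory (invisibly),
# so it can be saved and restored later.
old_dir <- setwd(tempdir())

# Confirm where we are now.
in_temp <- getwd()
in_temp

# Files read or written with a bare name, such as "cars1.txt",
# are now looked up relative to this directory.

# Go back to where we started.
setwd(old_dir)
getwd()
```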

Setting Up Working Directories in R (R Tutorial 1.8) MarinStatsLectures

In R, we can easily write data frames to a file, using the write.table() command.

> write.table(cars1, file="cars1.txt", quote=F)

The first argument refers to the data frame to be written to the output file; the second is the name of the output file. By default R will surround character and factor entries, as well as the row and column names, with double quotes in the output file, so we use quote=F to suppress them.

Now, let's check whether R created the file on the Desktop, by going to the Desktop and clicking to open the file. You should see a file with three columns, the first giving the index (or row number) and the other two the speed and distance. R by default creates a column of row indices. If we wanted to create a file without the row indices, we would use the command:

> write.table(cars1, file="cars1.txt", quote=F, row.names=F)
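The effect of row.names can be checked from within R by reading the written file back as plain lines. This is a sketch under assumptions: the first rows of R's built-in cars dataset stand in for the cars1 data frame, and the file is written to a temporary directory (out_path) rather than the Desktop.

```r
# Stand-in for cars1: the first six rows of the built-in cars dataset,
# which has the columns speed and dist.
cars1 <- head(cars)
out_path <- file.path(tempdir(), "cars1.txt")   # hypothetical location

# Default behaviour: a column of row indices is written first.
write.table(cars1, file = out_path, quote = FALSE)
with_rows <- readLines(out_path)
with_rows[1:2]

# With row.names = FALSE only the two data columns are written.
write.table(cars1, file = out_path, quote = FALSE, row.names = FALSE)
without_rows <- readLines(out_path)
without_rows[1:2]
```

Note that the second call overwrites the file written by the first, since both use the same file name.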

Datasets in R


Watch the video below for a concise introduction to working with the variables in an R dataset.

Working with Variables and Data in R (R Tutorial 1.5) MarinStatsLectures

Around 100 datasets are supplied with R (in the package datasets), and others are available.

To see the list of datasets currently available, use the command:

> data()
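The listing can also be captured and searched rather than read in a pop-up window. This is a small sketch, assuming we only want the datasets package; the variable name available is made up for illustration.

```r
# data() returns an object whose "results" component is a character
# matrix with the columns Package, LibPath, Item and Title.
available <- data(package = "datasets")$results

# Show the first few dataset names and their descriptions.
head(available[, c("Item", "Title")])

# Check whether a dataset whose name starts with CO2 is listed.
any(grepl("^CO2", available[, "Item"]))
```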

We will first look at a data set on CO2 (carbon dioxide) uptake in grass plants available in R.

> CO2

[Note: capitalization matters here; also, it's the letter O, not zero. Typing this command should display the entire dataset called CO2, which has 84 observations (in rows) and 5 variables (columns).]

To get more information on the variables in the dataset, type in

> help(CO2)
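Besides the help page, the dataset itself can be inspected from the prompt. This is a brief sketch using only base R functions; the variable names co2_dim and co2_names are made up for illustration.

```r
# Number of rows (observations) and columns (variables).
co2_dim <- dim(CO2)
co2_dim

# The variable names exactly as R stores them; these are the names
# to use after the $ sign when working with a single column.
co2_names <- names(CO2)
co2_names

# A compact overview of each variable's type and first few values.
str(CO2)

# A single variable is accessed with $, for example the first six
# values of the uptake column.
head(CO2$uptake)
```

Note that the concentration and uptake variables are stored under the names conc and uptake.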

Evaluate and report the mean and standard deviation of the variables "Concentration" and "Uptake".

Subsetting Data in R With Square Brackets and Logic Statements (R Tutorial 1.6) MarinStatsLectures
