Jeromy Anglim's Blog: Psychology and Statistics


Tuesday, February 23, 2010

Getting Started with Sweave: R, LaTeX, Eclipse, StatET, & TeXlipse

Being able to press a single button that runs all your statistical analyses and integrates the output into your final report is a beautiful thing. If you have not already heard, this is what Sweave can do for you. However, getting your computer to run Sweave can be a little bit fiddly. Thus, this post: (1) sets out the benefits of Sweave; (2) sets out how to install and configure R, Sweave, and Eclipse on Windows; (3) lists resources for people wanting to learn more about how to use LaTeX and Sweave; and (4) lists some specific resources relevant to researchers in psychology wanting to use these tools.

OVERVIEW

What is Sweave?

To Sweave is to weave in S. To weave is to combine data analysis code and standard formatted text into a single self-describing document. R is a dialect of S. Thus, if you use R to do your statistical analyses and you want to automate the importation of analyses in R into your reports, Sweave may be the tool for you. For a longer description, see Friedrich Leisch's (2002) Sweave: Dynamic Generation of Statistical Reports Using Literate Data Analysis.
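To make the idea of weaving concrete, here is a minimal sketch of what a Sweave source file might look like. The file name report.Rnw, the chunk label, and the simulated reaction-time data are all hypothetical and chosen only for illustration.

```latex
\documentclass{article}
\begin{document}

<<summary, echo=TRUE>>=
# Hypothetical data: ten simulated reaction times in milliseconds
set.seed(1)
rt <- rnorm(10, mean = 500, sd = 50)
summary(rt)
@

The mean reaction time was \Sexpr{round(mean(rt), 1)} ms.

\end{document}
```

The text between `<<...>>=` and `@` is an R code chunk; everything else is ordinary LaTeX, and `\Sexpr{}` inserts the value of an R expression into the running text. Assuming the file is saved as report.Rnw in R's working directory, running `Sweave("report.Rnw")` in R should produce report.tex, which can then be compiled with pdflatex or with `tools::texi2pdf("report.tex")`.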

Why Sweave?

  • Reproducibility: The most important reason to adopt a tool like Sweave is to make your research more reproducible. The R code sets out exactly how the raw data is transformed into publication output. The Sweave document links this R output with the final report.
  • Efficiency: Statistical output is automatically incorporated into your report. There is no need to copy and paste output from your statistical analysis program into your report. If your data or analyses change, you can update your report with a single click instead of having to manually update every table and figure.
  • Reliability: The integration of analyses with the report reduces the chance of errors entering in through copying and pasting of statistical output into documents.
  • Education & Communication: Providing the data analysis code alongside a report teaches others how to do similar analyses.
  • For an extended discussion, see Anthony Rossini and Friedrich Leisch's (2003) working paper Literate Statistical Practice.

Common Use Cases

  • Statistics Instructional Materials
  • Empirical reports, journal articles, book chapters, theses, etc.
Data sharing, literate programming, reproducible research, weaving: This is the future of data analysis. Why not get on board now?

MY SWEAVE INSTALLATION AND CONFIGURATION

A strength of R, Sweave, and LaTeX is that they are cross-platform tools that can be integrated to support powerful data analytic workflows. These tools run on Windows, Linux, and Mac OS with a range of text editors and command line options. However, this flexibility in configuration presents a challenge. There is no single-click installation file like "setup.exe". The tools need to be assembled. This section sets out how to install and configure a system for writing Sweave documents based around the Eclipse IDE and a Windows operating system. It's not the only way to assemble a system to support Sweave, but for someone entrenched in the Windows world, I think it's a good start.

1. Download and install R

Get R from the R Project. I'm assuming you already use R, but if not you may wish to read my post on Getting Started with R.

2. Download and install a LaTeX distribution

There are several LaTeX distributions. I installed MiKTeX.

3. Download and install Eclipse and the StatET and TeXlipse plugins

See the StatET Installation page for instructions on how to install StatET and Eclipse.
See also the links under "Getting Started" with StatET and Eclipse.

3.a Download and install Eclipse

See the StatET Installation page. This assumes you have Java installed.

3.b Install the StatET plugin and the TeXlipse plugin

See the StatET Installation page.

3.c Configure the StatET plugin

See Longhow Lam's Eclipse and the R plug-in StatET

3.d Configure the TeXlipse plugin

The TeXlipse homepage lists general information; see specifically the configuration page. My configuration could be abbreviated to: go to Window - Preferences in Eclipse; then TeXlipse - Builder Settings; then enter the bin directory of your TeX distribution. In my case this was "D:\MiKTeX 2.8\miktex\bin".

3.e Configure Sweave

  • Sweave.sty: Sweave is a built-in function in R. However, when you run Sweave, your LaTeX distribution needs to be able to find a file called "Sweave.sty". The file is stored in your R program files (e.g., "C:\Program Files\R\R-2.9.1\share\texmf"). A quick way to make it accessible is to place a copy of the file in the Eclipse project folder where the Rnw file is located. See this R-Help post for tips. UMN has some additional tips. (UPDATE: Bernd referred me to some additional material on linking Sweave.sty with MiKTeX.)
  • External Tools: Go to Run - External Tools - External Tools Configurations; select Sweave Document Processing (R/LaTeX); click the New button; give the configuration a name like "Sweave-PDF"; then, under the LaTeX tab, set the output format to pdf and the build command to pdflatex.exe.
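If you are unsure where Sweave.sty lives in your own R installation, the following R sketch may help. It searches the share directory of the running R installation and copies the first match into the current working directory. It assumes the working directory is the project folder that holds your Rnw file; the variable name sty is just illustrative.

```r
# Search the R installation's share directory for Sweave.sty.
# The exact subfolder differs between R versions, so search recursively.
sty <- list.files(R.home("share"), pattern = "^Sweave\\.sty$",
                  recursive = TRUE, full.names = TRUE)
print(sty)

# Copy the first match into the current working directory, assumed here
# to be the project folder that holds the .Rnw file.
if (length(sty) > 0) {
  file.copy(sty[1], file.path(getwd(), "Sweave.sty"), overwrite = TRUE)
}
```

Copying the file is the quick fix described above; registering the folder with your LaTeX distribution is the tidier long-term option, since a copied file will not follow later R upgrades.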

LEARNING LATEX

Using Sweave assumes that you know how to use LaTeX. If you just want to write LaTeX documents using Eclipse (without R code), you can go to File - New Project (TeXlipse - LaTeX Project). Once you have a basic working environment, it's easy to experiment with all the details of LaTeX. Here are some web guides among the many that are available.

LEARNING SWEAVE

Sweave is fairly straightforward. In Eclipse you can start a new R Project and add an *.Rnw file to write your Sweave document. Then use the Document menu to convert the Sweave file to a TEX file, PDF file, etc. There are many more general resources on Sweave:

LATEX, SWEAVE, AND PSYCHOLOGY

Adopting LaTeX and Sweave presents several challenges related to discipline-specific needs. These pertain particularly to the various style conventions expected for journal submission. The following are some useful resources: