Title: | Read, Write, Format Excel 2007 and Excel 97/2000/XP/2003 Files |
---|---|
Description: | Provide R functions to read/write/format Excel 2007 and Excel 97/2000/XP/2003 file formats. |
Authors: | Adrian Dragulescu [aut], Cole Arendt [aut, cre] |
Maintainer: | Cole Arendt <[email protected]> |
License: | GPL-3 |
Version: | 0.6.5.9000 |
Built: | 2024-11-03 04:51:03 UTC |
Source: | https://github.com/colearendt/xlsx |
The xlsx
package gives programatic control of Excel files using R. A
high level API allows the user to read a sheet of an xlsx document into a
data.frame
and write a data.frame
to a file. Lower level
functionality permits the direct manipulation of sheets, rows and cells.
For example, the user has control to set colors, fonts, data formats, add
borders, hide/unhide sheets, add/remove rows, add/remove sheets, etc.
Behind the scenes, the xlsx
package uses a java library from the
Apache project, https://poi.apache.org/index.html. This Apache project
provides a Java API to Microsoft Documents (Excel, Word, PowerPoint,
Outlook, Visio, etc.) By using the rJava
package that links and
Java, we can piggyback on the excellent work already done by the folks at
the Apache project and provide this functionality in R. The xlsx
package uses only a subset of the Apache POI project, namely the one dealing
with Excel files. All the necessary jar files are kept in package
xlsxjars
that is imported by package xlsx
.
A collection of tests that can be used as examples are located in folder
/tests/
. They are a good source of examples of how to use the
package.
Please see https://github.com/colearendt/xlsx/ for a Wiki and the development version. To report a bug, use the Issues page at https://github.com/colearendt/xlsx/issues.
Package: | xlsx |
Type: | Package |
Version: | 0.6.0 |
Date: | 2015-11-29 |
License: | GPL-3 |
Apache POI project for Microsoft Excel format: https://poi.apache.org/components/spreadsheet/index.html.
The Java Doc detailing the classes: https://poi.apache.org/apidocs/index.html. This can be useful if you are looking for something that is not exposed in R as it may be available on the Java side. Inspecting the source code for some the the functions in this package can show you how to do it (even if you are Java shy.)
Workbook
for ways to work with Workbook
objects.
## Not run: library(xlsx) # example of reading xlsx sheets file <- system.file("tests", "test_import.xlsx", package = "xlsx") res <- read.xlsx(file, 2) # read the second sheet # example of writing xlsx sheets file <- paste(tempfile(), "xlsx", sep=".") write.xlsx(USArrests, file=file) ## End(Not run)
## Not run: library(xlsx) # example of reading xlsx sheets file <- system.file("tests", "test_import.xlsx", package = "xlsx") res <- read.xlsx(file, 2) # read the second sheet # example of writing xlsx sheets file <- paste(tempfile(), "xlsx", sep=".") write.xlsx(USArrests, file=file) ## End(Not run)
data.frame
to a sheet.Add a data.frame
to a sheet, allowing for different column styles.
Useful when constructing the spreadsheet from scratch.
addDataFrame( x, sheet, col.names = TRUE, row.names = TRUE, startRow = 1, startColumn = 1, colStyle = NULL, colnamesStyle = NULL, rownamesStyle = NULL, showNA = FALSE, characterNA = "", byrow = FALSE )
addDataFrame( x, sheet, col.names = TRUE, row.names = TRUE, startRow = 1, startColumn = 1, colStyle = NULL, colnamesStyle = NULL, rownamesStyle = NULL, showNA = FALSE, characterNA = "", byrow = FALSE )
x |
a |
sheet |
a |
col.names |
a logical value indicating if the column names of |
row.names |
a logical value indicating whether the row names of
|
startRow |
a numeric value for the starting row. |
startColumn |
a numeric value for the starting column. |
colStyle |
a list of |
colnamesStyle |
a |
rownamesStyle |
a |
showNA |
a boolean value to control how NA's are displayed on the
sheet. If |
characterNA |
a string value to control how character NA will be shown in the spreadsheet. |
byrow |
a logical value indicating if the data.frame should be added to the sheet in row wise fashion. |
Starting with version 0.5.0 this function uses the functionality provided by
CellBlock
which results in a significant improvement in performance
compared with a cell by cell application of setCellValue
and
with other previous atempts.
It is difficult to treat NA
's consistently between R and Excel via
Java. Most likely, users of Excel will want to see NA
's as blank
cells. In R character NA
's are simply characters, which for Excel
means "NA".
The default formats for Date and DateTime columns can be changed via the two
package options xlsx.date.format
and xlsx.datetime.format
.
They need to be specified in Java date format
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html.
None. The modification to the workbook is done in place.
Adrian Dragulescu
wb <- createWorkbook() sheet <- createSheet(wb, sheetName="addDataFrame1") data <- data.frame(mon=month.abb[1:10], day=1:10, year=2000:2009, date=seq(as.Date("1999-01-01"), by="1 year", length.out=10), bool=c(TRUE, FALSE), log=log(1:10), rnorm=10000*rnorm(10), datetime=seq(as.POSIXct("2011-11-06 00:00:00", tz="GMT"), by="1 hour", length.out=10)) cs1 <- CellStyle(wb) + Font(wb, isItalic=TRUE) # rowcolumns cs2 <- CellStyle(wb) + Font(wb, color="blue") cs3 <- CellStyle(wb) + Font(wb, isBold=TRUE) + Border() # header addDataFrame(data, sheet, startRow=3, startColumn=2, colnamesStyle=cs3, rownamesStyle=cs1, colStyle=list(`2`=cs2, `3`=cs2)) # to change the default date format use something like this # options(xlsx.date.format="dd MMM, yyyy") # Don't forget to save the workbook ... # saveWorkbook(wb, file)
wb <- createWorkbook() sheet <- createSheet(wb, sheetName="addDataFrame1") data <- data.frame(mon=month.abb[1:10], day=1:10, year=2000:2009, date=seq(as.Date("1999-01-01"), by="1 year", length.out=10), bool=c(TRUE, FALSE), log=log(1:10), rnorm=10000*rnorm(10), datetime=seq(as.POSIXct("2011-11-06 00:00:00", tz="GMT"), by="1 hour", length.out=10)) cs1 <- CellStyle(wb) + Font(wb, isItalic=TRUE) # rowcolumns cs2 <- CellStyle(wb) + Font(wb, color="blue") cs3 <- CellStyle(wb) + Font(wb, isBold=TRUE) + Border() # header addDataFrame(data, sheet, startRow=3, startColumn=2, colnamesStyle=cs3, rownamesStyle=cs1, colStyle=list(`2`=cs2, `3`=cs2)) # to change the default date format use something like this # options(xlsx.date.format="dd MMM, yyyy") # Don't forget to save the workbook ... # saveWorkbook(wb, file)
Add a hyperlink to a cell to point to an external resource.
addHyperlink( cell, address, linkType = c("URL", "DOCUMENT", "EMAIL", "FILE"), hyperlinkStyle = NULL )
addHyperlink( cell, address, linkType = c("URL", "DOCUMENT", "EMAIL", "FILE"), hyperlinkStyle = NULL )
cell |
a |
address |
a string pointing to the resource. |
linkType |
a the type of the resource. |
hyperlinkStyle |
a |
The cell needs to have content before you add a hyperlink to it. The contents of the cells don't need to be the same as the address of the hyperlink. See the examples.
None. The modification to the cell is done in place.
Adrian Dragulescu
wb <- createWorkbook() sheet1 <- createSheet(wb, "Sheet1") rows <- createRow(sheet1, 1:10) # 10 rows cells <- createCell(rows, colIndex=1:8) # 8 columns ## Add hyperlinks to a cell cell <- cells[[1,1]] address <- "https://poi.apache.org/" setCellValue(cell, "click me!") addHyperlink(cell, address) # Don't forget to save the workbook ...
wb <- createWorkbook() sheet1 <- createSheet(wb, "Sheet1") rows <- createRow(sheet1, 1:10) # 10 rows cells <- createCell(rows, colIndex=1:8) # 8 columns ## Add hyperlinks to a cell cell <- cells[[1,1]] address <- "https://poi.apache.org/" setCellValue(cell, "click me!") addHyperlink(cell, address) # Don't forget to save the workbook ...
Functions to manipulate cells.
createCell(row, colIndex = 1:5) getCells(row, colIndex = NULL, simplify = TRUE) setCellValue(cell, value, richTextString = FALSE, showNA = TRUE) getCellValue(cell, keepFormulas = FALSE, encoding = "unknown")
createCell(row, colIndex = 1:5) getCells(row, colIndex = NULL, simplify = TRUE) setCellValue(cell, value, richTextString = FALSE, showNA = TRUE) getCellValue(cell, keepFormulas = FALSE, encoding = "unknown")
row |
a list of row objects. See |
colIndex |
a numeric vector specifying the index of columns. |
simplify |
a logical value. If |
cell |
a |
value |
an R variable of length one. |
richTextString |
a logical value indicating if the value should be inserted into the Excel cell as rich text. |
showNA |
a logical value. If |
keepFormulas |
a logical value. If |
encoding |
A character value to set the encoding, for example "UTF-8". |
setCellValue
writes the content of an R variable into the cell.
Date
and POSIXct
objects are passed in as numerical values.
To format them as dates in Excel see CellStyle
.
These functions are not vectorized and should be used only for small
spreadsheets. Use CellBlock
functionality to efficiently read/write
parts of a spreadsheet.
createCell
creates a matrix of lists, each element of the list being
a java object reference to an object of type Cell representing an empty
cell. The dimnames of this matrix are taken from the names of the rows and
the colIndex
variable.
getCells
returns a list of java object references for all the cells
in the row if colIndex
is NULL
. If you want to extract only a
specific columns, set colIndex
to the column indices you are
interested.
getCellValue
returns the value in the cell as an R object. Type
conversions are done behind the scene. This function is not vectorized.
Adrian Dragulescu
To format cells, see CellStyle
. For rows see
Row
, for sheets see Sheet
.
file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheets <- getSheets(wb) sheet <- sheets[['mixedTypes']] # get second sheet rows <- getRows(sheet) # get all the rows cells <- getCells(rows) # returns all non empty cells values <- lapply(cells, getCellValue) # extract the values # write the months of the year in the first column of the spreadsheet ind <- paste(2:13, ".2", sep="") mapply(setCellValue, cells[ind], month.name) #################################################################### # make a new workbook with one sheet and 5x5 cells wb <- createWorkbook() sheet <- createSheet(wb, "Sheet1") rows <- createRow(sheet, rowIndex=1:5) cells <- createCell(rows, colIndex=1:5) # populate the first column with Dates days <- seq(as.Date("2013-01-01"), by="1 day", length.out=5) mapply(setCellValue, cells[,1], days)
file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheets <- getSheets(wb) sheet <- sheets[['mixedTypes']] # get second sheet rows <- getRows(sheet) # get all the rows cells <- getCells(rows) # returns all non empty cells values <- lapply(cells, getCellValue) # extract the values # write the months of the year in the first column of the spreadsheet ind <- paste(2:13, ".2", sep="") mapply(setCellValue, cells[ind], month.name) #################################################################### # make a new workbook with one sheet and 5x5 cells wb <- createWorkbook() sheet <- createSheet(wb, "Sheet1") rows <- createRow(sheet, rowIndex=1:5) cells <- createCell(rows, colIndex=1:5) # populate the first column with Dates days <- seq(as.Date("2013-01-01"), by="1 day", length.out=5) mapply(setCellValue, cells[,1], days)
Functions to create and style (not read) a block of cells. Use it to set/update cell values and cell styles in an efficient manner.
CellBlock(sheet, startRow, startColumn, noRows, noColumns, create = TRUE) is.CellBlock(cellBlock) ## Default S3 method: CellBlock(sheet, startRow, startColumn, noRows, noColumns, create = TRUE) CB.setColData( cellBlock, x, colIndex, rowOffset = 0, showNA = TRUE, colStyle = NULL ) CB.setRowData( cellBlock, x, rowIndex, colOffset = 0, showNA = TRUE, rowStyle = NULL ) CB.setMatrixData( cellBlock, x, startRow, startColumn, showNA = TRUE, cellStyle = NULL ) CB.setFill(cellBlock, fill, rowIndex, colIndex) CB.setFont(cellBlock, font, rowIndex, colIndex) CB.setBorder(cellBlock, border, rowIndex, colIndex)
CellBlock(sheet, startRow, startColumn, noRows, noColumns, create = TRUE) is.CellBlock(cellBlock) ## Default S3 method: CellBlock(sheet, startRow, startColumn, noRows, noColumns, create = TRUE) CB.setColData( cellBlock, x, colIndex, rowOffset = 0, showNA = TRUE, colStyle = NULL ) CB.setRowData( cellBlock, x, rowIndex, colOffset = 0, showNA = TRUE, rowStyle = NULL ) CB.setMatrixData( cellBlock, x, startRow, startColumn, showNA = TRUE, cellStyle = NULL ) CB.setFill(cellBlock, fill, rowIndex, colIndex) CB.setFont(cellBlock, font, rowIndex, colIndex) CB.setBorder(cellBlock, border, rowIndex, colIndex)
sheet |
a |
startRow |
a numeric value for the starting row. |
startColumn |
a numeric value for the starting column. |
noRows |
a numeric value to specify the number of rows for the block. |
noColumns |
a numeric value to specify the number of columns for the block. |
create |
If |
cellBlock |
a cell block object as returned by |
x |
the data you want to add to the cell block, a vector or a matrix depending on the function. |
colIndex |
a numeric vector specifiying the columns you want relative
to the |
rowOffset |
a numeric value for the starting row. |
showNA |
a logical value. If set to |
colStyle |
a |
rowIndex |
a numeric vector specifiying the rows you want relative to
the |
colOffset |
a numeric value for the starting column. |
rowStyle |
a |
cellStyle |
a |
fill |
a Fill object, as returned by |
font |
a Font object, as returned by |
border |
a Border object, as returned by |
Introduced in version 0.5.0 of the package, these functions speed up the
creation and styling of cells that are part of a "cell block" (a rectangular
shaped group of cells). Use the functions above if you want to create
efficiently a complex sheet with many styles. A simple by-column styling
can be done by directly using addDataFrame
. With the
functionality provided here you can efficiently style individual cells, see
the example.
It is difficult to treat NA
's consistently between R and Excel via
Java. Most likely, users of Excel will want to see NA
's as blank
cells. In R character NA
's are simply characters, which for Excel
means "NA".
If you try to set more data to the block than you have cells in the block, only the existing cells will be set.
Note that when modifying the style of a group of cells, the changes are made
to the pairs defined by (rowIndex, colIndex)
. This implies that the
length of rowIndex
and colIndex
are the same value. An
exception is made when either rowIndex
or colIndex
have length
one, when they will be expanded internally to match the length of the other
index.
Function CB.setMatrixData
works for numeric or character matrices.
If the matrix x
is not of numeric type it will be converted to a
character matrix.
For CellBlock
a cell block object.
For CB.setColData
, CB.setRowData
, CB.setMatrixData
,
CB.setFill
, CB.setFont
, CB.setBorder
nothing as he
modification to the workbook is done in place.
Adrian Dragulescu
wb <- createWorkbook() sheet <- createSheet(wb, sheetName="CellBlock") cb <- CellBlock(sheet, 7, 3, 1000, 60) CB.setColData(cb, 1:100, 1) # set a column CB.setRowData(cb, 1:50, 1) # set a row # add a matrix, and style it cs <- CellStyle(wb) + DataFormat("#,##0.00") x <- matrix(rnorm(900*45), nrow=900) CB.setMatrixData(cb, x, 10, 4, cellStyle=cs) # highlight the negative numbers in red fill <- Fill(foregroundColor = "red", backgroundColor="red") ind <- which(x < 0, arr.ind=TRUE) CB.setFill(cb, fill, ind[,1]+9, ind[,2]+3) # note the indices offset # set the border on the top row of the Cell Block border <- Border(color="blue", position=c("TOP", "BOTTOM"), pen=c("BORDER_THIN", "BORDER_THICK")) CB.setBorder(cb, border, 1, 1:1000) # Don't forget to save the workbook ... # saveWorkbook(wb, file)
wb <- createWorkbook() sheet <- createSheet(wb, sheetName="CellBlock") cb <- CellBlock(sheet, 7, 3, 1000, 60) CB.setColData(cb, 1:100, 1) # set a column CB.setRowData(cb, 1:50, 1) # set a row # add a matrix, and style it cs <- CellStyle(wb) + DataFormat("#,##0.00") x <- matrix(rnorm(900*45), nrow=900) CB.setMatrixData(cb, x, 10, 4, cellStyle=cs) # highlight the negative numbers in red fill <- Fill(foregroundColor = "red", backgroundColor="red") ind <- which(x < 0, arr.ind=TRUE) CB.setFill(cb, fill, ind[,1]+9, ind[,2]+3) # note the indices offset # set the border on the top row of the Cell Block border <- Border(color="blue", position=c("TOP", "BOTTOM"), pen=c("BORDER_THIN", "BORDER_THICK")) CB.setBorder(cb, border, 1, 1:1000) # Don't forget to save the workbook ... # saveWorkbook(wb, file)
Create and set cell styles.
CellStyle( wb, dataFormat = NULL, alignment = NULL, border = NULL, fill = NULL, font = NULL, cellProtection = NULL ) is.CellStyle(x) ## Default S3 method: CellStyle( wb, dataFormat = NULL, alignment = NULL, border = NULL, fill = NULL, font = NULL, cellProtection = NULL ) setCellStyle(cell, cellStyle) getCellStyle(cell)
CellStyle( wb, dataFormat = NULL, alignment = NULL, border = NULL, fill = NULL, font = NULL, cellProtection = NULL ) is.CellStyle(x) ## Default S3 method: CellStyle( wb, dataFormat = NULL, alignment = NULL, border = NULL, fill = NULL, font = NULL, cellProtection = NULL ) setCellStyle(cell, cellStyle) getCellStyle(cell)
wb |
a workbook object as returned by |
dataFormat |
a |
alignment |
a |
border |
a |
fill |
a |
font |
a |
cellProtection |
a |
x |
a |
cell |
a |
cellStyle |
a |
setCellStyle
sets the CellStyle
to one Cell
object.
You need to have a Workbook
object to attach a CellStyle
object to it.
Since OS X 10.5 Apple dropped support for AWT on the main thread, so essentially you cannot use any graphics classes in R on OS X 10.5 since R is single-threaded. (verbatim from Simon Urbanek). This implies that setting colors on Mac will not work as is! A set of about 50 basic colors are still available please see the javadocs.
For Excel 95/2000/XP/2003 the choice of colors is limited. See
INDEXED_COLORS_
for the list of available colors.
Unspecified values for arguments are taken from the system locale.
createCellStyle
creates a CellStyle object.
is.CellStyle
returns TRUE
if the argument is of class
"CellStyle" and FALSE
otherwise.
Adrian Dragulescu
## Not run: wb <- createWorkbook() sheet <- createSheet(wb, "Sheet1") rows <- createRow(sheet, rowIndex=1) cell.1 <- createCell(rows, colIndex=1)[[1,1]] setCellValue(cell.1, "Hello R!") cs <- CellStyle(wb) + Font(wb, heightInPoints=20, isBold=TRUE, isItalic=TRUE, name="Courier New", color="orange") + Fill(backgroundColor="lavender", foregroundColor="lavender", pattern="SOLID_FOREGROUND") + Alignment(h="ALIGN_RIGHT") setCellStyle(cell.1, cellStyle1) # you need to save the workbook now if you want to see this art ## End(Not run)
## Not run: wb <- createWorkbook() sheet <- createSheet(wb, "Sheet1") rows <- createRow(sheet, rowIndex=1) cell.1 <- createCell(rows, colIndex=1)[[1,1]] setCellValue(cell.1, "Hello R!") cs <- CellStyle(wb) + Font(wb, heightInPoints=20, isBold=TRUE, isItalic=TRUE, name="Courier New", color="orange") + Fill(backgroundColor="lavender", foregroundColor="lavender", pattern="SOLID_FOREGROUND") + Alignment(h="ALIGN_RIGHT") setCellStyle(cell.1, cellStyle1) # you need to save the workbook now if you want to see this art ## End(Not run)
Create cell styles.
## S3 method for class 'CellStyle' cs1 + object
## S3 method for class 'CellStyle' cs1 + object
cs1 |
a |
object |
an object to add. The object can be another
|
The style of the argument object takes precedence over the style of argument cs1.
A CellStyle object.
Adrian Dragulescu
## Not run: cs <- CellStyle(wb) + Font(wb, heightInPoints=20, isBold=TRUE, isItalic=TRUE, name="Courier New", color="orange") + Fill(backgroundColor="lavender", foregroundColor="lavender", pattern="SOLID_FOREGROUND") + Alignment(h="ALIGN_RIGHT") setCellStyle(cell.1, cellStyle1) # you need to save the workbook now if you want to see this art ## End(Not run)
## Not run: cs <- CellStyle(wb) + Font(wb, heightInPoints=20, isBold=TRUE, isItalic=TRUE, name="Courier New", color="orange") + Fill(backgroundColor="lavender", foregroundColor="lavender", pattern="SOLID_FOREGROUND") + Alignment(h="ALIGN_RIGHT") setCellStyle(cell.1, cellStyle1) # you need to save the workbook now if you want to see this art ## End(Not run)
These functions are not vectorized.
createCellComment(cell, string, author = NULL, visible = TRUE) removeCellComment(cell) getCellComment(cell)
createCellComment(cell, string, author = NULL, visible = TRUE) removeCellComment(cell) getCellComment(cell)
cell |
a |
string |
a string for the comment. |
author |
a string with the author's name |
visible |
a logical value. If |
createCellComment
creates a Comment
object.
getCellComment
returns a the Comment
object if it exists.
removeCellComment
removes a comment from the given cell.
Adrian Dragulescu
For cells, see Cell
. To format cells, see
CellStyle
.
wb <- createWorkbook() sheet1 <- createSheet(wb, "Sheet1") rows <- createRow(sheet1, rowIndex=1:10) # 10 rows cells <- createCell(rows, colIndex=1:8) # 8 columns cell1 <- cells[[1,1]] setCellValue(cell1, 1) # add value 1 to cell A1 # create a cell comment createCellComment(cell1, "Cogito", author="Descartes") # extract the comments comment <- getCellComment(cell1) stopifnot(comment$getAuthor()=="Descartes") stopifnot(comment$getString()$toString()=="Cogito") # don't forget to save your workbook!
wb <- createWorkbook() sheet1 <- createSheet(wb, "Sheet1") rows <- createRow(sheet1, rowIndex=1:10) # 10 rows cells <- createCell(rows, colIndex=1:8) # 8 columns cell1 <- cells[[1,1]] setCellValue(cell1, 1) # add value 1 to cell A1 # create a cell comment createCellComment(cell1, "Cogito", author="Descartes") # extract the comments comment <- getCellComment(cell1) stopifnot(comment$getAuthor()=="Descartes") stopifnot(comment$getString()$toString()=="Cogito") # don't forget to save your workbook!
Functions to force formula calculation or refresh of pivot tables when the Excel file is opened.
forceFormulaRefresh(file, output = NULL, verbose = FALSE) forcePivotTableRefresh(file, output = NULL, verbose = FALSE)
forceFormulaRefresh(file, output = NULL, verbose = FALSE) forcePivotTableRefresh(file, output = NULL, verbose = FALSE)
file |
the path of the source file where formulae/pivot table needs to be refreshed |
output |
the path of the output file. If it is |
verbose |
Whether to make logging more verbose |
forcePivotTableRefresh
forces pivot tables to be refreshed when the Excel file is opened.
forceFormulaRefresh
forces formulae to be recalculated when the Excel file is opened.
Does not return any results
Tom Kwong
# Patch a file where its pivot tables are not recalculated when the file is opened ## Not run: forcePivotTableRefresh("/tmp/file.xlsx") forcePivotTableRefresh("/tmp/file.xlsx", "/tmp/fixed_file.xlsx") ## End(Not run) # Patch a file where its formulae are not recalculated when the file is opened ## Not run: forceFormulaRefresh("/tmp/file.xlsx") forceFormulaRefresh("/tmp/file.xlsx", "/tmp/fixed_file.xlsx") ## End(Not run)
# Patch a file where its pivot tables are not recalculated when the file is opened ## Not run: forcePivotTableRefresh("/tmp/file.xlsx") forcePivotTableRefresh("/tmp/file.xlsx", "/tmp/fixed_file.xlsx") ## End(Not run) # Patch a file where its formulae are not recalculated when the file is opened ## Not run: forceFormulaRefresh("/tmp/file.xlsx") forceFormulaRefresh("/tmp/file.xlsx", "/tmp/fixed_file.xlsx") ## End(Not run)
Create an Alignment object, useful when working with cell styles.
is.Alignment(x) Alignment( horizontal = NULL, vertical = NULL, wrapText = FALSE, rotation = 0, indent = 0 )
is.Alignment(x) Alignment( horizontal = NULL, vertical = NULL, wrapText = FALSE, rotation = 0, indent = 0 )
x |
An Alignment object, as returned by |
horizontal |
a character value specifying the horizontal alignment.
Valid values come from constant |
vertical |
a character value specifying the vertical alignment. Valid
values come from constant |
wrapText |
a logical indicating if the text should be wrapped. |
rotation |
a numerical value indicating the degrees you want to rotate the text in the cell. |
indent |
a numerical value indicating the number of spaces you want to indent the text in the cell. |
Alignment
returns a list with components from the input
argument, and a class attribute "Alignment". Alignment objects are used
when constructing cell styles.
is.Alignment
returns TRUE
if the argument is of class
"Alignment" and FALSE
otherwise.
Adrian Dragulescu
CellStyle
for using the a Alignment
object.
# you can just use h for horizontal, since R does the matching for you a1 <- Alignment(h="ALIGN_CENTER", rotation=90) # centered and rotated!
# you can just use h for horizontal, since R does the matching for you a1 <- Alignment(h="ALIGN_CENTER", rotation=90) # centered and rotated!
Create an Border object, useful when working with cell styles.
is.Border(x) Border(color = "black", position = "BOTTOM", pen = "BORDER_THIN")
is.Border(x) Border(color = "black", position = "BOTTOM", pen = "BORDER_THIN")
x |
An Border object, as returned by |
color |
a character vector specifiying the font color. Any color names
as returned by |
position |
a character vector specifying the border position. Valid values are "BOTTOM", "LEFT", "TOP", "RIGHT". |
pen |
a character vector specifying the pen style. Valid values come
from constant |
The values for the color, position, or pen arguments are replicated to the longest of them.
Border
returns a list with components from the input
argument, and a class attribute "Border". Border objects are used when
constructing cell styles.
is.Border
returns TRUE
if the argument is of class "Border"
and FALSE
otherwise.
Adrian Dragulescu
CellStyle
for using the a Border
object.
border <- Border(color="red", position=c("TOP", "BOTTOM"), pen=c("BORDER_THIN", "BORDER_THICK"))
border <- Border(color="red", position=c("TOP", "BOTTOM"), pen=c("BORDER_THIN", "BORDER_THICK"))
Create a CellProtection object used for cell styles.
is.CellProtection(x) CellProtection(locked = TRUE, hidden = FALSE)
is.CellProtection(x) CellProtection(locked = TRUE, hidden = FALSE)
x |
A CellProtection object, as returned by |
locked |
a logical indicating the cell is locked. |
a logical indicating the cell is hidden. |
CellProtection
returns a list with components from the input
argument, and a class attribute "CellProtection". CellProtection objects
are used when constructing cell styles.
is.CellProtection
returns TRUE
if the argument is of class
"CellProtection" and FALSE
otherwise.
Adrian Dragulescu
CellStyle
for using the a CellProtection
object.
font <- CellProtection(locked=TRUE)
font <- CellProtection(locked=TRUE)
Create an DataFormat object, useful when working with cell styles.
is.DataFormat(df) DataFormat(x)
is.DataFormat(df) DataFormat(x)
df |
An DataFormat object, as returned by |
x |
a character value specifying the data format. |
Specifying the dataFormat
argument allows you to format the cell.
For example, "#,##0.00" corresponds to using a comma separator for powers of
1000 with two decimal places, "m/d/yyyy" can be used to format dates and is
the equivalent of 's MM/DD/YYYY format. To format datetimes use "m/d/yyyy
h:mm:ss;@". To show negative values in red within parantheses with two
decimals and commas after power of 1000 use "#,##0.00_);[Red](#,##0.00)". I
am not aware of an official way to discover these strings. I find them out
by recording a macro that formats a specific cell and then checking out the
resulting VBA code. From there you can read the dataFormat
code.
DataFormat
returns a list one component dataFormat, and a
class attribute "DataFormat". DataFormat objects are used when constructing
cell styles.
is.DataFormat
returns TRUE
if the argument is of class
"DataFormat" and FALSE
otherwise.
Adrian Dragulescu
CellStyle
for using the a DataFormat
object.
df <- DataFormat("#,##0.00")
df <- DataFormat("#,##0.00")
Create an Fill object, useful when working with cell styles.
is.Fill(x) Fill( foregroundColor = "lightblue", backgroundColor = "lightblue", pattern = "SOLID_FOREGROUND" )
is.Fill(x) Fill( foregroundColor = "lightblue", backgroundColor = "lightblue", pattern = "SOLID_FOREGROUND" )
x |
a Fill object, as returned by |
foregroundColor |
a character vector specifiying the foreground color.
Any color names as returned by |
backgroundColor |
a character vector specifiying the foreground color.
Any color names as returned by |
pattern |
a character vector specifying the fill pattern style. Valid
values come from constant |
Fill
returns a list with components from the input argument,
and a class attribute "Fill". Fill objects are used when constructing cell
styles.
is.Fill
returns TRUE
if the argument is of class "Fill" and
FALSE
otherwise.
Adrian Dragulescu
CellStyle
for using the a Fill
object.
fill <- Fill()
fill <- Fill()
Create a Font object.
is.Font(x) Font( wb, color = NULL, heightInPoints = NULL, name = NULL, isItalic = FALSE, isStrikeout = FALSE, isBold = FALSE, underline = NULL, boldweight = NULL )
is.Font(x) Font( wb, color = NULL, heightInPoints = NULL, name = NULL, isItalic = FALSE, isStrikeout = FALSE, isBold = FALSE, underline = NULL, boldweight = NULL )
x |
A Font object, as returned by |
wb |
a workbook object as returned by |
color |
a character specifiying the font color. Any color names as
returned by |
heightInPoints |
a numeric value specifying the font height. Usual values are 10, 12, 14, etc. |
name |
a character value for the font to use. All values that you see in Excel should be available, e.g. "Courier New". |
isItalic |
a logical indicating the font should be italic. |
isStrikeout |
a logical indicating the font should be stiked out. |
isBold |
a logical indicating the font should be bold. |
underline |
a numeric value specifying the thickness of the underline. Allowed values are 0, 1, 2. |
boldweight |
a numeric value indicating bold weight. Normal is 400, regular bold is 700. |
Default values for NULL
parameters are taken from Excel. So the
default font color is black, the default font name is "Calibri", and the
font height in points is 11.
For Excel 95/2000/XP/2003, it is impossible to set the font to bold. This limitation may be removed in the future.
NOTE: You need to have a Workbook
object to attach a Font
object to it.
Font
returns a list with a java reference to a Font
object, and a class attribute "Font".
is.Font
returns TRUE
if the argument is of class "Font" and
FALSE
otherwise.
Adrian Dragulescu
CellStyle
for using the a Font
object.
## Not run: font <- Font(wb, color="blue", isItalic=TRUE) ## End(Not run)
## Not run: font <- Font(wb, color="blue", isItalic=TRUE) ## End(Not run)
These functions are provided for convenience only. Use directly the Java API to access additional functionality.
createRange(rangeName, firstCell, lastCell) getRanges(wb) readRange(range, sheet, colClasses = "character")
createRange(rangeName, firstCell, lastCell) getRanges(wb) readRange(range, sheet, colClasses = "character")
rangeName |
a character specifying the name of the name to create. |
firstCell |
a cell object corresponding to the top left cell in the range. |
lastCell |
a cell object corresponding to the bottom right cell in the range. |
wb |
a workbook object as returned by |
range |
a range object as returned by |
sheet |
a sheet object as returned by |
colClasses |
the type of the columns supported. Only |
getRanges
returns the existing ranges as a list.
readRange
reads the range into a data.frame.
createRange
returns the created range object.
Adrian Dragulescu
file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheet <- getSheets(wb)[["deletedFields"]] ranges <- getRanges(wb) # the call below fails on cran tests for MacOS. You should see the # FAQ: https://code.google.com/p/rexcel/wiki/FAQ #res <- readRange(ranges[[1]], sheet, colClasses="numeric") # read it ranges[[1]]$getNameName() # get its name # see all the available java methods that you can call rJava::.jmethods(ranges[[1]]) # create a new named range firstCell <- sheet$getRow(14L)$getCell(4L) lastCell <- sheet$getRow(20L)$getCell(7L) rangeName <- "Test2" # same issue on MacOS #createRange(rangeName, firstCell, lastCell)
file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheet <- getSheets(wb)[["deletedFields"]] ranges <- getRanges(wb) # the call below fails on cran tests for MacOS. You should see the # FAQ: https://code.google.com/p/rexcel/wiki/FAQ #res <- readRange(ranges[[1]], sheet, colClasses="numeric") # read it ranges[[1]]$getNameName() # get its name # see all the available java methods that you can call rJava::.jmethods(ranges[[1]]) # create a new named range firstCell <- sheet$getRow(14L)$getCell(4L) lastCell <- sheet$getRow(20L)$getCell(7L) rangeName <- "Test2" # same issue on MacOS #createRange(rangeName, firstCell, lastCell)
Function autoSizeColumn
expands the column width to match the column
contents thus removing the ###### that you get when cell contents are larger
than cell width.
addAutoFilter(sheet, cellRange) addMergedRegion(sheet, startRow, endRow, startColumn, endColumn) removeMergedRegion(sheet, ind) autoSizeColumn(sheet, colIndex) createFreezePane( sheet, rowSplit, colSplit, startRow = NULL, startColumn = NULL ) createSplitPane( sheet, xSplitPos = 2000, ySplitPos = 2000, startRow = 1, startColumn = 1, position = "PANE_LOWER_LEFT" ) setColumnWidth(sheet, colIndex, colWidth) setPrintArea(wb, sheetIndex, startColumn, endColumn, startRow, endRow) setZoom(sheet, numerator = 100, denominator = 100)
addAutoFilter(sheet, cellRange) addMergedRegion(sheet, startRow, endRow, startColumn, endColumn) removeMergedRegion(sheet, ind) autoSizeColumn(sheet, colIndex) createFreezePane( sheet, rowSplit, colSplit, startRow = NULL, startColumn = NULL ) createSplitPane( sheet, xSplitPos = 2000, ySplitPos = 2000, startRow = 1, startColumn = 1, position = "PANE_LOWER_LEFT" ) setColumnWidth(sheet, colIndex, colWidth) setPrintArea(wb, sheetIndex, startColumn, endColumn, startRow, endRow) setZoom(sheet, numerator = 100, denominator = 100)
sheet |
a |
cellRange |
a string specifying the cell range. For example a standard area ref (e.g. "B1:D8"). May be a single cell ref (e.g. "B5") in which case the result is a 1 x 1 cell range. May also be a whole row range (e.g. "3:5"), or a whole column range (e.g. "C:F") |
startRow |
a numeric value for the starting row. |
endRow |
a numeric value for the ending row. |
startColumn |
a numeric value for the starting column. |
endColumn |
a numeric value for the ending column. |
ind |
a numeric value indicating which merged region you want to remove. |
colIndex |
a numeric vector specifiying the columns you want to auto size. |
rowSplit |
a numeric value for the row to split. |
colSplit |
a numeric value for the column to split. |
xSplitPos |
a numeric value for the horizontal position of split in 1/20 of a point. |
ySplitPos |
a numeric value for the vertical position of split in 1/20 of a point. |
position |
a character. Valid value are "PANE_LOWER_LEFT", "PANE_LOWER_RIGHT", "PANE_UPPER_LEFT", "PANE_UPPER_RIGHT". |
colWidth |
a numeric value to specify the width of the column. The units are in 1/256ths of a character width. |
wb |
a |
sheetIndex |
a numeric value for the worksheet index. |
numerator |
a numeric value representing the numerator of the zoom ratio. |
denominator |
a numeric value representing the denomiator of the zoom ratio. |
You may need other functionality that is not exposed. Take a look at the java docs and the source code of these functions for how you can implement it in R.
addMergedRegion
returns a numeric value to label the merged
region. You should use this value as the ind
if you want to
removeMergedRegion
.
Adrian Dragulescu
wb <- createWorkbook() sheet1 <- createSheet(wb, "Sheet1") rows <- createRow(sheet1, 1:10) # 10 rows cells <- createCell(rows, colIndex=1:8) # 8 columns ## Merge cells setCellValue(cells[[1,1]], "A title that spans 3 columns") addMergedRegion(sheet1, 1, 1, 1, 3) ## Set zoom 2:1 setZoom(sheet1, 200, 100) sheet2 <- createSheet(wb, "Sheet2") rows <- createRow(sheet2, 1:10) # 10 rows cells <- createCell(rows, colIndex=1:8) # 8 columns #createFreezePane(sheet2, 1, 1, 1, 1) createFreezePane(sheet2, 5, 5, 8, 8) sheet3 <- createSheet(wb, "Sheet3") rows <- createRow(sheet3, 1:10) # 10 rows cells <- createCell(rows, colIndex=1:8) # 8 columns createSplitPane(sheet3, 2000, 2000, 1, 1, "PANE_LOWER_LEFT") # set the column width of first column to 25 characters wide setColumnWidth(sheet1, 1, 25) # add a filter on the 3rd row, columns C:E addAutoFilter(sheet1, "C3:E3") # Don't forget to save the workbook ...
wb <- createWorkbook() sheet1 <- createSheet(wb, "Sheet1") rows <- createRow(sheet1, 1:10) # 10 rows cells <- createCell(rows, colIndex=1:8) # 8 columns ## Merge cells setCellValue(cells[[1,1]], "A title that spans 3 columns") addMergedRegion(sheet1, 1, 1, 1, 3) ## Set zoom 2:1 setZoom(sheet1, 200, 100) sheet2 <- createSheet(wb, "Sheet2") rows <- createRow(sheet2, 1:10) # 10 rows cells <- createCell(rows, colIndex=1:8) # 8 columns #createFreezePane(sheet2, 1, 1, 1, 1) createFreezePane(sheet2, 5, 5, 8, 8) sheet3 <- createSheet(wb, "Sheet3") rows <- createRow(sheet3, 1:10) # 10 rows cells <- createCell(rows, colIndex=1:8) # 8 columns createSplitPane(sheet3, 2000, 2000, 1, 1, "PANE_LOWER_LEFT") # set the column width of first column to 25 characters wide setColumnWidth(sheet1, 1, 25) # add a filter on the 3rd row, columns C:E addAutoFilter(sheet1, "C3:E3") # Don't forget to save the workbook ...
For now, the following image types are supported: dib, emf, jpeg, pict, png, wmf. Please note, that scaling works correctly only for workbooks with the default font size (Calibri 11pt for .xlsx). If the default font is changed the resized image can be streched vertically or horizontally.
addPicture(file, sheet, scale = 1, startRow = 1, startColumn = 1)
addPicture(file, sheet, scale = 1, startRow = 1, startColumn = 1)
file |
the absolute path to the image file. |
sheet |
a worksheet object as returned by |
scale |
a numeric specifying the scale factor for the image. |
startRow |
a numeric specifying the row of the upper left corner of the image. |
startColumn |
a numeric specifying the column of the upper left corner of the image. |
Don't know how to remove an existing image yet.
addPicture
a java object references pointing to the picture.
Adrian Dragulescu
file <- system.file("tests", "log_plot.jpeg", package = "xlsx") wb <- createWorkbook() sheet <- createSheet(wb, "Sheet1") addPicture(file, sheet) # don't forget to save the workbook!
file <- system.file("tests", "log_plot.jpeg", package = "xlsx") wb <- createWorkbook() sheet <- createSheet(wb, "Sheet1") addPicture(file, sheet) # don't forget to save the workbook!
Document some Apache POI constants used in the project.
HALIGN_STYLES_ VALIGN_STYLES_ BORDER_STYLES_ FILL_STYLES_ CELL_STYLES_ INDEXED_COLORS_
HALIGN_STYLES_ VALIGN_STYLES_ BORDER_STYLES_ FILL_STYLES_ CELL_STYLES_ INDEXED_COLORS_
An object of class numeric
of length 7.
An object of class numeric
of length 4.
An object of class numeric
of length 14.
An object of class numeric
of length 19.
An object of class numeric
of length 44.
An object of class numeric
of length 48.
A named vector.
Adrian Dragulescu
CellStyle
for using the POI_constants
.
Other settings are available but not exposed. Please see the java docs.
printSetup( sheet, fitHeight = NULL, fitWidth = NULL, copies = NULL, draft = NULL, footerMargin = NULL, headerMargin = NULL, landscape = FALSE, pageStart = NULL, paperSize = NULL, noColor = NULL )
printSetup( sheet, fitHeight = NULL, fitWidth = NULL, copies = NULL, draft = NULL, footerMargin = NULL, headerMargin = NULL, landscape = FALSE, pageStart = NULL, paperSize = NULL, noColor = NULL )
sheet |
a worksheet object |
fitHeight |
numeric value to set the number of pages high to fit the sheet in. |
fitWidth |
numeric value to set the number of pages wide to fit the sheet in. |
copies |
numeric value to set the number of copies. |
draft |
logical indicating if it's a draft or not. |
footerMargin |
numeric value to set the footer margin. |
headerMargin |
numeric value to set the header margin. |
landscape |
logical value to specify the paper orientation. |
pageStart |
numeric value from where to start the page numbering. |
paperSize |
character to set the paper size. Valid values are "A4_PAPERSIZE", "A5_PAPERSIZE", "ENVELOPE_10_PAPERSIZE", "ENVELOPE_CS_PAPERSIZE", "ENVELOPE_DL_PAPERSIZE", "ENVELOPE_MONARCH_PAPERSIZE", "EXECUTIVE_PAPERSIZE", "LEGAL_PAPERSIZE", "LETTER_PAPERSIZE". |
noColor |
logical value to indicate if the prints should be color or not. |
A reference to a java PrintSetup object.
Adrian Dragulescu
wb <- createWorkbook() sheet <- createSheet(wb, "Sheet1") ps <- printSetup(sheet, landscape=TRUE, copies=3)
wb <- createWorkbook() sheet <- createSheet(wb, "Sheet1") ps <- printSetup(sheet, landscape=TRUE, copies=3)
data.frame
.The read.xlsx
function provides a high level API for reading data
from an Excel worksheet. It calls several low level functions in the
process. Its goal is to provide the conveniency of
read.table
by borrowing from its signature.
read.xlsx( file, sheetIndex, sheetName = NULL, rowIndex = NULL, startRow = NULL, endRow = NULL, colIndex = NULL, as.data.frame = TRUE, header = TRUE, colClasses = NA, keepFormulas = FALSE, encoding = "unknown", password = NULL, ... ) read.xlsx2( file, sheetIndex, sheetName = NULL, startRow = 1, colIndex = NULL, endRow = NULL, as.data.frame = TRUE, header = TRUE, colClasses = "character", password = NULL, ... )
read.xlsx( file, sheetIndex, sheetName = NULL, rowIndex = NULL, startRow = NULL, endRow = NULL, colIndex = NULL, as.data.frame = TRUE, header = TRUE, colClasses = NA, keepFormulas = FALSE, encoding = "unknown", password = NULL, ... ) read.xlsx2( file, sheetIndex, sheetName = NULL, startRow = 1, colIndex = NULL, endRow = NULL, as.data.frame = TRUE, header = TRUE, colClasses = "character", password = NULL, ... )
file |
the path to the file to read. |
sheetIndex |
a number representing the sheet index in the workbook. |
sheetName |
a character string with the sheet name. |
rowIndex |
a numeric vector indicating the rows you want to extract.
If |
startRow |
a number specifying the index of starting row. For
|
endRow |
a number specifying the index of the last row to pull. If
|
colIndex |
a numeric vector indicating the cols you want to extract.
If |
as.data.frame |
a logical value indicating if the result should be
coerced into a |
header |
a logical value indicating whether the first row corresponding
to the first element of the |
colClasses |
For |
keepFormulas |
a logical value indicating if Excel formulas should be shown as text in and not evaluated before bringing them in. |
encoding |
encoding to be assumed for input strings. See
|
password |
a String with the password. |
... |
other arguments to |
The function pulls the value of each non empty cell in the worksheet into a
vector of type list
by preserving the data type. If
as.data.frame=TRUE
, this vector of lists is then formatted into a
rectangular shape. Special care is needed for worksheets with ragged data.
An attempt is made to guess the class type of the variable corresponding to
each column in the worksheet from the type of the first non empty cell in
that column. If you need to impose a specific class type on a variable, use
the colClasses
argument. It is recommended to specify the column
classes and not rely on R
to guess them, unless in very simple cases.
Excel internally stores dates and datetimes as numeric values, and does not
keep track of time zones and DST. When a datetime column is brought into ,
it is converted to POSIXct
class with a GMT timezone.
Occasional rounding errors may appear and the and Excel string
representation my differ by one second. For read.xlsx2
bring in a
datetime column as a numeric one and then convert to class POSIXct
or
Date
. Also rounding the POSIXct
column in R usually does the
trick too.
The read.xlsx2
function does more work in Java so it achieves better
performance (an order of magnitude faster on sheets with 100,000 cells or
more). The result of read.xlsx2
will in general be different from
read.xlsx
, because internally read.xlsx2
uses
readColumns
which is tailored for tabular data.
Reading of password protected workbooks is supported for Excel 2007 OOXML format only.
A data.frame or a list, depending on the as.data.frame
argument. If some of the columns are read as NA's it's an indication that
the colClasses
argument has not been set properly.
If the sheet is empty, return NULL
. If the sheet does not exist,
return an error.
Adrian Dragulescu
write.xlsx
for writing xlsx
documents. See
also readColumns
for reading only a set of columns into R.
## Not run: file <- system.file("tests", "test_import.xlsx", package = "xlsx") res <- read.xlsx(file, 1) # read first sheet head(res) # NA. Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area # 1 Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708 # 2 Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432 # 3 Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417 # 4 Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945 # 5 California 21198 5114 1.1 71.71 10.3 62.6 20 156361 # 6 Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766 # > # To convert an Excel datetime colum to POSIXct, do something like: # as.POSIXct((x-25569)*86400, tz="GMT", origin="1970-01-01") # For Dates, use a conversion like: # as.Date(x-25569, origin="1970-01-01") res2 <- read.xlsx2(file, 1) ## End(Not run)
## Not run: file <- system.file("tests", "test_import.xlsx", package = "xlsx") res <- read.xlsx(file, 1) # read first sheet head(res) # NA. Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area # 1 Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708 # 2 Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432 # 3 Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417 # 4 Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945 # 5 California 21198 5114 1.1 71.71 10.3 62.6 20 156361 # 6 Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766 # > # To convert an Excel datetime colum to POSIXct, do something like: # as.POSIXct((x-25569)*86400, tz="GMT", origin="1970-01-01") # For Dates, use a conversion like: # as.Date(x-25569, origin="1970-01-01") res2 <- read.xlsx2(file, 1) ## End(Not run)
Read a contiguous set of columns from sheet into an R data.frame. Uses the
RInterface
for speed.
readColumns( sheet, startColumn, endColumn, startRow, endRow = NULL, as.data.frame = TRUE, header = TRUE, colClasses = NA, ... )
readColumns( sheet, startColumn, endColumn, startRow, endRow = NULL, as.data.frame = TRUE, header = TRUE, colClasses = NA, ... )
sheet |
a |
startColumn |
a numeric value for the starting column. |
endColumn |
a numeric value for the ending column. |
startRow |
a numeric value for the starting row. |
endRow |
a numeric value for the ending row. If |
as.data.frame |
a logical value indicating if the result should be
coerced into a |
header |
a logical value indicating whether the first row corresponding
to the first element of the |
colClasses |
a character vector that represents the class of each
column. Recycled as necessary, or if |
... |
other arguments to |
Use the readColumns
function when you want to read a rectangular
block of data from an Excel worksheet. If you request columns which are
blank, these will be read in as empty character "" columns. Internally, the
loop over columns is done in R, the loop over rows is done in Java, so this
function achieves good performance when number of rows >> number of columns.
Excel internally stores dates and datetimes as numeric values, and does not
keep track of time zones and DST. When a numeric column is formatted as a
datetime, it will be converted into POSIXct
class with a GMT
timezone. If you need a Date
column, you need to specify explicitly
using colClasses
argument.
For a numeric column Excels's errors and blank cells will be returned as NaN
values. Excel's #N/A
will be returned as NA. Formulas will be
evaluated. For a chracter column, blank cells will be returned as "".
A data.frame or a list, depending on the as.data.frame
argument.
Adrian Dragulescu
read.xlsx2
for reading entire sheets. See also
addDataFrame
for writing a data.frame
to a sheet.
## Not run: file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheets <- getSheets(wb) sheet <- sheets[["all"]] res <- readColumns(sheet, startColumn=3, endColumn=10, startRow=3, endRow=7) sheet <- sheets[["NAs"]] res <- readColumns(sheet, 1, 6, 1, colClasses=c("Date", "character", "integer", rep("numeric", 2), "POSIXct")) ## End(Not run)
## Not run: file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheets <- getSheets(wb) sheet <- sheets[["all"]] res <- readColumns(sheet, startColumn=3, endColumn=10, startRow=3, endRow=7) sheet <- sheets[["NAs"]] res <- readColumns(sheet, 1, 6, 1, colClasses=c("Date", "character", "integer", rep("numeric", 2), "POSIXct")) ## End(Not run)
Read a contiguous set of rows into an R character matrix. Uses the
RInterface
for speed.
readRows(sheet, startRow, endRow, startColumn, endColumn = NULL)
readRows(sheet, startRow, endRow, startColumn, endColumn = NULL)
sheet |
a |
startRow |
a numeric value for the starting row. |
endRow |
a numeric value for the ending row. If |
startColumn |
a numeric value for the starting column. |
endColumn |
a numeric value for the ending column. Empty cells will be returned as "". |
Use the readRows
function when you want to read a row or a block
block of data from an Excel worksheet. Internally, the loop over rows is
done in R, and the loop over columns is done in Java, so this function
achieves good performance when number of rows << number of columns.
In general, you should prefer the function readColumns
over
this one.
A character matrix.
Adrian Dragulescu
read.xlsx2
for reading entire sheets. See also
addDataFrame
for writing a data.frame
to a sheet.
## Not run: file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheets <- getSheets(wb) sheet <- sheets[["all"]] res <- readRows(sheet, startRow=3, endRow=7, startColumn=3, endColumn=10) ## End(Not run)
## Not run: file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheets <- getSheets(wb) sheet <- sheets[["all"]] res <- readRows(sheet, startRow=3, endRow=7, startColumn=3, endColumn=10) ## End(Not run)
removeRow
is just a convenience wrapper to remove the rows from the
sheet (before saving). Internally it calls lapply
.
createRow(sheet, rowIndex = 1:5) getRows(sheet, rowIndex = NULL) removeRow(sheet, rows = NULL) setRowHeight(rows, inPoints, multiplier = NULL)
createRow(sheet, rowIndex = 1:5) getRows(sheet, rowIndex = NULL) removeRow(sheet, rows = NULL) setRowHeight(rows, inPoints, multiplier = NULL)
sheet |
a worksheet object as returned by |
rowIndex |
a numeric vector specifying the index of rows to create.
For |
rows |
a list of |
inPoints |
a numeric value to specify the height of the row in points. |
multiplier |
a numeric value to specify the multiple of default row
height in points. If this value is set, it takes precedence over the
|
For getRows
a list of java object references each pointing to
a row. The list is named with the row number.
Adrian Dragulescu
To extract the cells from a given row, see Cell
.
file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheets <- getSheets(wb) sheet <- sheets[[2]] rows <- getRows(sheet) # get all the rows # see all the available java methods that you can call rJava::.jmethods(rows[[1]]) # for example rows[[1]]$getRowNum() # zero based index in Java removeRow(sheet, rows) # remove them all # create some row rows <- createRow(sheet, rowIndex=1:5) setRowHeight( rows, multiplier=3) # 3 times bigger rows than the default
file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheets <- getSheets(wb) sheet <- sheets[[2]] rows <- getRows(sheet) # get all the rows # see all the available java methods that you can call rJava::.jmethods(rows[[1]]) # for example rows[[1]]$getRowNum() # zero based index in Java removeRow(sheet, rows) # remove them all # create some row rows <- createRow(sheet, rowIndex=1:5) setRowHeight( rows, multiplier=3) # 3 times bigger rows than the default
Java sets the java temp directory to '/tmp' by default. However, this is usually not desirable in R. As a result, this function allows changing that behavior. Further, this function is fired on package load to ensure all temp files are written to the R temp directory.
set_java_tmp_dir(tmp_dir = tempdir()) get_java_tmp_dir()
set_java_tmp_dir(tmp_dir = tempdir()) get_java_tmp_dir()
tmp_dir |
optional. The new temp directory. Defaults to the R temp directory |
On package load, we use 'getOption("xlsx.tempdir", tempdir())' for the default value, in case you want to have this value set by an option.
The previous java temp directory (prior to any changes).
Functions to manipulate worksheets.
getSheets(wb) createSheet(wb, sheetName = "Sheet1") removeSheet(wb, sheetName = "Sheet1")
getSheets(wb) createSheet(wb, sheetName = "Sheet1") removeSheet(wb, sheetName = "Sheet1")
wb |
a workbook object as returned by |
sheetName |
a character specifying the name of the worksheet to create, or remove. |
createSheet
returns the created Sheet
object.
getSheets
returns a list of java object references each pointing to
an worksheet. The list is named with the sheet names.
Adrian Dragulescu
To extract rows from a given sheet, see Row
.
file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheets <- getSheets(wb) sheet <- sheets[[2]] # extract the second sheet # see all the available java methods that you can call rJava::.jmethods(sheet) # for example sheet$getLastRowNum()
file <- system.file("tests", "test_import.xlsx", package = "xlsx") wb <- loadWorkbook(file) sheets <- getSheets(wb) sheet <- sheets[[2]] # extract the second sheet # see all the available java methods that you can call rJava::.jmethods(sheet) # for example sheet$getLastRowNum()
createWorkbook
creates an empty workbook object.
createWorkbook(type = "xlsx") loadWorkbook(file, password = NULL) saveWorkbook(wb, file, password = NULL)
createWorkbook(type = "xlsx") loadWorkbook(file, password = NULL) saveWorkbook(wb, file, password = NULL)
type |
a String, either |
file |
the path to the file you intend to read or write. Can be an xls or xlsx format. |
password |
a String with the password. |
wb |
a workbook object as returned by |
loadWorkbook
loads a workbook from a file.
saveWorkbook
saves an existing workook to an Excel 2007 file.
Reading or writing of password protected workbooks is supported for Excel 2007 OOXML format only. Note that in Linux, LibreOffice is not able to read password protected spreadsheets.
createWorkbook
returns a java object reference pointing to an
empty workbook object.
loadWorkbook
creates a java object reference corresponding to the
file to load.
Adrian Dragulescu
write.xlsx
for writing a data.frame
to an
xlsx
file. read.xlsx
for reading the content of a
xlsx
worksheet into a data.frame
. To extract worksheets and
manipulate them, see Worksheet
.
wb <- createWorkbook() # see all the available java methods that you can call rJava::.jmethods(wb) # for example wb$getNumberOfSheets() # no sheet yet! # loadWorkbook("C:/Temp/myFile.xls")
wb <- createWorkbook() # see all the available java methods that you can call rJava::.jmethods(wb) # for example wb$getNumberOfSheets() # no sheet yet! # loadWorkbook("C:/Temp/myFile.xls")
Write a data.frame
to an Excel workbook.
write.xlsx( x, file, sheetName = "Sheet1", col.names = TRUE, row.names = TRUE, append = FALSE, showNA = TRUE, password = NULL ) write.xlsx2( x, file, sheetName = "Sheet1", col.names = TRUE, row.names = TRUE, append = FALSE, password = NULL, ... )
write.xlsx( x, file, sheetName = "Sheet1", col.names = TRUE, row.names = TRUE, append = FALSE, showNA = TRUE, password = NULL ) write.xlsx2( x, file, sheetName = "Sheet1", col.names = TRUE, row.names = TRUE, append = FALSE, password = NULL, ... )
x |
a |
file |
the path to the output file. |
sheetName |
a character string with the sheet name. |
col.names |
a logical value indicating if the column names of |
row.names |
a logical value indicating whether the row names of
|
append |
a logical value indicating if |
showNA |
a logical value. If set to |
password |
a String with the password. |
... |
other arguments to |
This function provides a high level API for writing a data.frame
to
an Excel 2007 worksheet. It calls several low level functions in the
process. Its goal is to provide the conveniency of
write.csv
by borrowing from its signature.
Internally, write.xlsx
uses a double loop in over all the elements of
the data.frame
so performance for very large data.frame
may be
an issue. Please report if you experience slow performance. Dates and
POSIXct classes are formatted separately after the insertion. This also
adds to processing time.
If x
is not a data.frame
it will be converted to one.
Function write.xlsx2
uses addDataFrame
which speeds up the
execution compared to write.xlsx
by an order of magnitude for large
spreadsheets (with more than 100,000 cells).
The default formats for Date and DateTime columns can be changed via the two
package options xlsx.date.format
and xlsx.datetime.format
.
They need to be specified in Java date format
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html.
Writing of password protected workbooks is supported for Excel 2007 OOXML format only. Note that in Linux, LibreOffice is not able to read password protected spreadsheets.
Adrian Dragulescu
read.xlsx
for reading xlsx
documents. See
also addDataFrame
for writing a data.frame
to a sheet.
## Not run: file <- paste(tempdir(), "/usarrests.xlsx", sep="") res <- write.xlsx(USArrests, file) # to change the default date format oldOpt <- options() options(xlsx.date.format="dd MMM, yyyy") write.xlsx(x, sheet) # where x is a data.frame with a Date column. options(oldOpt) # revert back to defaults ## End(Not run)
## Not run: file <- paste(tempdir(), "/usarrests.xlsx", sep="") res <- write.xlsx(USArrests, file) # to change the default date format oldOpt <- options() options(xlsx.date.format="dd MMM, yyyy") write.xlsx(x, sheet) # where x is a data.frame with a Date column. options(oldOpt) # revert back to defaults ## End(Not run)