0

I have a loop where I load 10 huge excel 2007 files into a list of data.frame thanks to openxlsx library. This loop crashes at the 4th file when no delay is added. Adding a 500ms delay after each read solves the problem.

Can you tell me why?

Even if openxlsx doesn't use Java (unlike xlsx library) I guess that it's a problem of "garbage collection" that grows more quickly than it decrease... If I am right, does it exist a solution more "scientific" than a simple delay? (something like << .jcall("java/lang/System",,"gc") >> which is usefull for Java based xlsx read loop?)

Here is the code part that crashes (with the sys.sleep line that solves the pb) :

Library(openxlsx)

...

for (index in 1:10) {
  print(sprintf("Loading file : %s", List_files[index]))
  Data[[List_years[index]]] <- readWorkbook(List_files[index], sheet=1, colNames=T, skipEmptyRows=T, detectDates=F)
  Sys.sleep(0.5)
}

Thanks.

Syl33
  • 183

1 Answers1

1

I found the solution for this issue yesterday . I fact, I've tried the "gc()" command without success... but this was because it is necessary to force the cleaning with the reset option. So "gc(reset=T)" works and is more efficient than the inserted delay.

Rui F Ribeiro
  • 56,709
  • 26
  • 150
  • 232
Syl33
  • 183