Skip to content

Memory optimization (~30% less)#633

Open
korun wants to merge 2 commits into
randym:masterfrom
korun:memory-refactor-final
Open

Memory optimization (~30% less)#633
korun wants to merge 2 commits into
randym:masterfrom
korun:memory-refactor-final

Conversation

@korun

@korun korun commented May 11, 2019

Copy link
Copy Markdown

Hi, this is tiny optimization without complete refactoring or something.

The main idea: decrease count of allocated objects (mostly strings) by some ruby tricks and freezing.

Modern ruby versions allow not only just prevent strings from mutation with freeze, but also allocate it only once:

# ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-linux]
2.2.2 :001 > 'asdads332'.freeze.object_id
# => 18294380 
2.2.2 :002 > 'asdads332'.freeze.object_id
# => 18294380 
2.2.2 :003 > 'asdads332'.freeze.object_id
# => 18294380 

If we need to support old ruby versions -- I suggest to consider it in a separate request.

Some measurements here: https://gist.github.com/korun/1f22ee8505122f2f307fc83dee3b9974


Refs #276

korun added 2 commits May 11, 2019 12:14
~ 24% more better performance on modern rubies
@gdimitris

Copy link
Copy Markdown

That looks very promising! Any memory relief is highly appreciated since this gem uses a lot of memory and causes memory issues when generating large excel files.
@randym is it possible to check for any regressions and create a new release ?
There is a pending memory issue since January 2014 #276

Thanks !

@gdimitris

gdimitris commented May 11, 2019

Copy link
Copy Markdown

@korun do you perhaps have any more tips/hints to reduce memory usage ? I am getting a lot of memory quota exceeded hits on Heroku on my project and face a lot of issues

@korun

korun commented May 11, 2019

Copy link
Copy Markdown
Author

@gdimitris, I think it can be implemented some "stream-mode", so axlsx will can write prepared data direct in temp file row-by-row. But it forbid some exists features, such as modifying row height, or styles, etc. after sheet.add_row was called. In this mode, all preparations must be made before row insertion.

@gdimitris

Copy link
Copy Markdown

@korun I am a bit confused, so you mean that streaming the rows would break existing styles in cells ? Or that they have to be applied differently ?

@korun

korun commented May 13, 2019

Copy link
Copy Markdown
Author

@gdimitris, for example, now we can do something like:

cells = sheet.add_row(
    ['Total', "Count: #{count}", *Array.new(6), "=SUM(I#{i}:I#{i + count})", nil, nil],
    style: row_table_cell_style(sheet)
)
cells[0].style = row_text_center_table_style(sheet)
cells[8].style = row_currency_style(sheet)

In "stream-mode" we have to do all preparations before row flush to file:

# flush to file enabled
cells = sheet.add_row(
    ['Total', "Count: #{count}", *Array.new(6), "=SUM(I#{i}:I#{i + count})", nil, nil],
    style: [
        row_text_center_table_style(sheet),
        *Array.new(7, row_table_cell_style(sheet)]),
        row_currency_style(sheet)
    ]
)
# at this line, all is good

# But, if we call:
cells[0].style = row_text_center_table_style(sheet)
# it raise some error, like "Cannot access to flushed row in stream-mode"

@gdimitris

Copy link
Copy Markdown

@korun Ok got it, thanks!
Do you believe implementing "streaming mode" is something that requires a lot of effort in terms of development ? If you are interested, we can collaborate and implement "streaming mode" together, since I do not see any recent activity from the creator of the gem. My guess is that a lot of people have memory issues with the gem and this would help a lot. Let me know if you are interested :D
I believe we can add it as a feature so it does not break existing functionality.

@klyonrad

klyonrad commented Oct 8, 2019

Copy link
Copy Markdown

@korun did you notice the "movement" that got started at https://github.com/caxlsx/caxlsx ? Can you submit the PR there as well, that could spike some big interest there :)

@gdimitris

Copy link
Copy Markdown

@klyonrad korun doesn't seem active but I can do this I think later in the day. It seems a very promising fix.

@gdimitris

Copy link
Copy Markdown

Turns out it was easier than I thought 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants