Fix control characters not being sanitized from shared strings in 1.3.6 #345

Open
wants to merge 3 commits into
base: master
2 changes: 1 addition & 1 deletion axlsx.gemspec
@@ -2,7 +2,7 @@ require File.expand_path('../lib/axlsx/version', __FILE__)

 Gem::Specification.new do |s|
   s.name = 'axlsx'
-  s.version = Axlsx::VERSION
+  s.version = "1.3.6"
   s.author = "Randy Morgan"
   s.email = '[email protected]'
   s.homepage = 'https://github.com/randym/axlsx'
8 changes: 8 additions & 0 deletions lib/axlsx.rb
@@ -127,6 +127,14 @@ def self.camel(s="", all_caps = true)
     s.gsub(/_(.)/){ $1.upcase }
   end

+  # returns the provided string with all invalid control characters
+  # removed.
+  # @param [String] str The string to process
+  # @return [String]
+  def self.sanitize(str)
+    str.gsub(CONTROL_CHAR_REGEX, '')
+  end
+

   # Instructs the serializer to not try to escape cell value input.
   # This will give you a huge speed bonus, but if you content has <, > or other xml character data
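For anyone reading this hunk without the constants file open, here is a minimal standalone sketch of what the new module-level helper does. `CONTROL_CHAR_REGEX` is not shown in this diff, so the pattern below is an assumed stand-in (the C0 control characters that XML 1.0 does not allow), and `sanitize_xml` is a hypothetical name, not part of Axlsx.

```ruby
# Assumed stand-in for Axlsx::CONTROL_CHAR_REGEX, which this diff does not show.
# Matches the C0 control characters that XML 1.0 forbids; tab (\x09),
# line feed (\x0A) and carriage return (\x0D) remain legal and are kept.
XML_ILLEGAL_CONTROL_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F]/

# Hypothetical helper mirroring the shape of Axlsx.sanitize:
# returns a copy of str with the forbidden control characters removed.
def sanitize_xml(str)
  str.gsub(XML_ILLEGAL_CONTROL_CHARS, '')
end

nasties = "hello\x10\x00\x1C\x1Eworld"
puts sanitize_xml(nasties)            # prints "helloworld"
puts sanitize_xml("a\tb\nc").inspect  # tab and newline are preserved
```

Because `gsub` returns a new string, the input is left untouched, which matches how the helper is used as the last expression of `to_xml_string` in the hunks that follow.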
2 changes: 1 addition & 1 deletion lib/axlsx/version.rb
@@ -1,5 +1,5 @@
 module Axlsx

   # The current version
-  VERSION = "1.3.6"
+  # VERSION = "1.3.6"
 end
3 changes: 2 additions & 1 deletion lib/axlsx/workbook/shared_strings_table.rb
@@ -41,7 +41,8 @@ def initialize(cells)
     # @param [String] str
     # @return [String]
     def to_xml_string
-      '<?xml version="1.0" encoding="UTF-8"?><sst xmlns="' << XML_NS << '" count="' << @count.to_s << '" uniqueCount="' << unique_count.to_s << '">' << @shared_xml_string << '</sst>'
+      str = '<?xml version="1.0" encoding="UTF-8"?><sst xmlns="' << XML_NS << '" count="' << @count.to_s << '" uniqueCount="' << unique_count.to_s << '">' << @shared_xml_string << '</sst>'
+      str = Axlsx::sanitize(str)
     end

     private
10 changes: 1 addition & 9 deletions lib/axlsx/workbook/worksheet/worksheet.rb
@@ -542,15 +542,7 @@ def to_xml_string
         item.to_xml_string(str) if item
       end
       str << '</worksheet>'
-      sanitize(str)
-    end
-
-    # returns the provided string with all invalid control charaters
-    # removed.
-    # @param [String] str The sting to process
-    # @return [String]
-    def sanitize(str)
-      str.gsub(CONTROL_CHAR_REGEX, '')
+      Axlsx::sanitize(str)
     end

     # The worksheet relationships. This is managed automatically by the worksheet
14 changes: 14 additions & 0 deletions test/workbook/tc_shared_strings_table.rb
@@ -35,4 +35,18 @@ def test_valid_document
     assert_equal(errors.size, 0, "sharedStirngs.xml Invalid" + errors.map{ |e| e.message }.to_s)
   end

+  def test_remove_control_characters_in_xml_serialization
+    nasties = "hello\x10\x00\x1C\x1Eworld"
+    @p.workbook.worksheets[0].add_row [nasties]
+
+    # test that the nasty string was added to the shared strings
+    assert @p.workbook.shared_strings.unique_cells.has_key?(nasties)
+
+    # test that none of the control characters are in the XML output for shared strings
+    assert_no_match Axlsx::CONTROL_CHAR_REGEX, @p.workbook.shared_strings.to_xml_string
+
+    # assert that the shared string was normalized to remove the control characters
+    assert_not_nil @p.workbook.shared_strings.to_xml_string.index("helloworld")
+  end
+
 end
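
The new test relies on a particular split: the shared strings table stays keyed by the raw cell value, and control characters are stripped only when the XML is serialized. A rough sketch of that behaviour, independent of Axlsx, might look like the following. `MiniSharedStrings` and `MINI_ILLEGAL_CONTROL_CHARS` are hypothetical names invented for illustration, the regex is an assumed stand-in for `Axlsx::CONTROL_CHAR_REGEX`, and XML escaping is deliberately left out.

```ruby
# Assumed stand-in for Axlsx::CONTROL_CHAR_REGEX (not shown in this diff).
MINI_ILLEGAL_CONTROL_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F]/

# Hypothetical miniature of a shared strings table; not the Axlsx class.
class MiniSharedStrings
  attr_reader :unique_cells

  def initialize
    @unique_cells = {} # raw string => index in the table
    @xml = +''         # accumulated <si> entries, still unsanitized
  end

  # Registers a raw value once and returns its index.
  # No XML escaping is done here; this sketch only covers control characters.
  def add(value)
    @unique_cells.fetch(value) do
      @xml << '<si><t>' << value << '</t></si>'
      @unique_cells[value] = @unique_cells.size
    end
  end

  # Sanitizes at serialization time, so the lookup keys keep the raw value.
  def to_xml_string
    str = +'<sst uniqueCount="' << @unique_cells.size.to_s << '">' << @xml << '</sst>'
    str.gsub(MINI_ILLEGAL_CONTROL_CHARS, '')
  end
end

table = MiniSharedStrings.new
raw = "hello\x10\x00\x1C\x1Eworld"
table.add(raw)
puts table.unique_cells.key?(raw) # the raw string is still the key
puts table.to_xml_string          # the serialized form contains "helloworld"
```

Sanitizing once over the finished string keeps the lookup path unchanged, at the cost of one extra pass over the whole document on every `to_xml_string` call.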