Nokogiri_error_raise as a libxml2 handler considered harmful #1610
Thanks for opening this issue! I'll take a look.
On Mar 1, 2017 4:12 PM, "pcapcanari" wrote:
Memory leak is observed while searching in a document with undefined
namespace prefix.
# Nokogiri (1.7.0.1)
---
warnings: []
nokogiri: 1.7.0.1
ruby:
  version: 2.3.1
  platform: x86_64-linux
  description: ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]
  engine: ruby
libxml:
  binding: extension
  source: packaged
  libxml2_path: "/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/nokogiri-1.7.0.1/ports/x86_64-pc-linux-gnu/libxml2/2.9.4"
  libxslt_path: "/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/nokogiri-1.7.0.1/ports/x86_64-pc-linux-gnu/libxslt/1.1.29"
  libxml2_patches: []
  libxslt_patches: []
  compiled: 2.9.4
  loaded: 2.9.4
Reproduced with:
require 'nokogiri'

while true do
  begin
    # parse simple xml. no namespace definition
    doc = Nokogiri::XML('<ns1:Root></ns1:Root>')
    # below code raises exception and leaks memory
    body = doc.search('//ns1:Root').first
    puts body.to_xml
  rescue => e
    puts e
  end
end
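To put a number on the leak rather than watching an endless loop, the reproduction can be bounded and wrapped in a small measurement helper. The following is a minimal sketch, not part of Nokogiri: `rss_kb` and `rss_growth_kb` are hypothetical names, and the helper assumes a system where `/proc/self/status` or `ps` is available. Resident set size is a coarse signal, so only steady growth across many iterations is meaningful.

```ruby
# Hypothetical helper (not part of Nokogiri): resident set size of this process, in kB.
def rss_kb
  status = "/proc/self/status"
  if File.readable?(status)
    File.read(status)[/^VmRSS:\s+(\d+)/, 1].to_i
  else
    # Fallback for systems without /proc (for example macOS).
    `ps -o rss= -p #{Process.pid}`.to_i
  end
end

# Runs the block `iterations` times and returns the change in RSS (kB).
# GC runs before each reading so that collectable Ruby objects are not counted.
def rss_growth_kb(iterations)
  GC.start
  before = rss_kb
  iterations.times { yield }
  GC.start
  rss_kb - before
end

# Usage with the reproduction above (requires the nokogiri gem):
#
#   require 'nokogiri'
#   growth = rss_growth_kb(100_000) do
#     begin
#       Nokogiri::XML('<ns1:Root></ns1:Root>').search('//ns1:Root')
#     rescue StandardError
#       # expected: the ns1 prefix is undefined
#     end
#   end
#   puts "RSS grew by #{growth} kB"
```

Running the same measurement against a query that does not raise gives a baseline to compare against.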
I've reproduced this behavior. Digging in.
OK, this leak appears to come from using
I've audited the C code for non-trivial raises, use of handlers, and general patterns of error handling: places where
I'll further note that this is the script I used to generate the samplings of code I filtered through above (in case I need to re-do the analysis after so much time):

#!/usr/bin/env ruby
target = "ext/nokogiri"
command = "ack -C3 --group --nocolor --ignore-file=match:xml_syntax_error*"
expressions = %w[
  Nokogiri_error_raise
  rb_exc_raise
  rb_raise
]
expressions.each do |expression|
  puts "## places where `#{expression}` is used"
  puts
  cmd = "#{command} #{expression} #{target}"
  output = `#{cmd}`
  output.split("\n").each do |line|
    if line =~ /\.[ch]$/
      puts "### #{line}"
      puts
      puts "```C"
    elsif line.empty?
      puts "```"
      puts
    elsif line =~ /^--$/
      puts "```"
      puts
      puts "```C"
    else
      puts line
    end
  end
  puts "```"
  puts
end
Note that TruffleRuby would also prefer that we not raise from a native code callback; see #1882.
Recent relevant post from @peterzhu2118 with explanatory notes: https://blog.peterzhu.ca/ruby-c-ext-part-8/
Note also some exploratory work I've started at #2096.
in xml_xpath_context.c:evaluate generic error handling Related to #1610
**What problem is this PR intended to solve?**
I think the stack trace changed with ruby/ruby@51bd8165. See, for example: https://github.com/sparklemotion/nokogiri/actions/runs/9935752042/job/27442553528
I'm not sure why this is the only leak showing up now, but I am deferring further study until I devote some time to cleaning up known leaks where we raise exceptions in Ruby callbacks (from libxml2); see #2096 and #1610.
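The underlying hazard, raising from inside a callback that a native library invokes, can be modelled in plain Ruby. The sketch below is an illustration only: `FakeNativeLib` and both `evaluate_*` methods are hypothetical stand-ins, not libxml2 or Nokogiri code. The fake library deliberately has no `ensure`, mirroring how C code that an exception jumps over never reaches its own cleanup. It contrasts raising inside the handler with recording the error and raising only after the call has returned.

```ruby
# Toy stand-in for a native library call; NOT libxml2 or Nokogiri code.
# It "allocates" a buffer, reports an error through a handler, then "frees" it.
class FakeNativeLib
  attr_reader :live_allocations

  def initialize
    @live_allocations = 0
  end

  def evaluate(expression, &error_handler)
    @live_allocations += 1 # stands in for malloc
    error_handler.call("Undefined namespace prefix in #{expression}")
    @live_allocations -= 1 # stands in for free; skipped if the handler raised
    nil
  end
end

# Pattern 1: the handler raises immediately, unwinding through the "native" frame.
def evaluate_raising_in_handler(lib, expression)
  lib.evaluate(expression) { |message| raise ArgumentError, message }
end

# Pattern 2: the handler only records errors; the raise happens after the call returns.
def evaluate_deferring_raise(lib, expression)
  errors = []
  lib.evaluate(expression) { |message| errors << message }
  raise ArgumentError, errors.first unless errors.empty?
end

leaky = FakeNativeLib.new
3.times do
  begin
    evaluate_raising_in_handler(leaky, "//ns1:Root")
  rescue ArgumentError
    # the caller sees the error, but the "free" above never ran
  end
end

clean = FakeNativeLib.new
3.times do
  begin
    evaluate_deferring_raise(clean, "//ns1:Root")
  rescue ArgumentError
    # same error for the caller, and the "free" already ran
  end
end

puts "raise inside handler: #{leaky.live_allocations} allocations never freed"
puts "raise after return:   #{clean.live_allocations} allocations never freed"
```

Both patterns surface the same exception to the caller; only the second lets the library finish its own bookkeeping first.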