-
Notifications
You must be signed in to change notification settings - Fork 688
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for PDF/A-1b #1029
base: master
Are you sure you want to change the base?
Add support for PDF/A-1b #1029
Conversation
@pointlessone can you have a look at my pull request? Any thoughts about what to change to get this feature integrated? |
.gitignore
Outdated
@@ -10,3 +10,4 @@ drop_to_console.rb | |||
/bin | |||
.DS_Store | |||
/.byebug_history |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should go into your global .gitignore
. This is not a Prawn dep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
lib/prawn/document.rb
Outdated
@@ -66,7 +66,8 @@ class Document | |||
:page_size, :page_layout, :margin, :left_margin, | |||
:right_margin, :top_margin, :bottom_margin, :skip_page_creation, | |||
:compress, :background, :info, | |||
:text_formatter, :print_scaling | |||
:text_formatter, :print_scaling, | |||
:trailer, :enable_pdfa_1b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather make the ID deterministic so that we didn't have to make trailer
accessible here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
lib/prawn/vera_pdf.rb
Outdated
require 'open3' | ||
|
||
module Prawn | ||
module VeraPdf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only used in specs, so it should live in specs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to comment on each my comment as long as you push granular commits. GitHub hides comments on the code that has been changed and lets reviewing only new changes. It also sends emails about new commits (not about force pushes, unfortunately, so please let me know about those if you want someone to look at those).
Just a hint to save a few minutes for you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am more used to GitLab which lets you 'resolve' a discussion manually. I use my 'dones' here primarily for keeping track of what I still need to do. For example your comment about making sure that the CI runs the veraPDF specs is now hidden as 'outdated' because I moved the file elsewhere. As long as you are not annoyed with getting emails for every 'done' I would continue with this practice. Or is there some GitHub trick for such things? 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is there some GitHub trick for such things?
Not that I know of. Would love to have a manual option like you described.
As long as you are not annoyed with getting emails for every 'done' I would continue with this practice.
Not at all. It absolutely makes sense since I left quite a few comments and it's hard to keep track otherwise.
|
||
include Prawn::VeraPdf | ||
|
||
if vera_pdf_available? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's nice to let developers know whats's wrong. But please make sure CI has all tools installed to actually run the specs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
spec/prawn/document_spec.rb
Outdated
pdf = described_class.new | ||
pdf.text 'James' | ||
output = StringIO.new(pdf.render) | ||
hash = PDF::Reader::ObjectHash.new(output) | ||
|
||
streams = hash.values.select { |obj| obj.is_a?(PDF::Reader::Stream) } | ||
|
||
expect(streams.size).to eq(1) | ||
expect(streams.size).to eq(2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
spec/prawn/document_spec.rb
Outdated
@@ -530,7 +536,7 @@ def self.format(string) | |||
|
|||
streams = hash.values.select { |obj| obj.is_a?(PDF::Reader::Stream) } | |||
|
|||
expect(streams.size).to eq(1) | |||
expect(streams.size).to eq(2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't change either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
spec/prawn/document_spec.rb
Outdated
# We need to overwrite the trailer ID, otherwise each render | ||
# pass will generate a new random ID and the documents would | ||
# not match. | ||
trailer_id = PDF::Core::ByteString.new(SecureRandom.random_bytes(16)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure this doesn't happen without any effort on the users part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
spec/prawn/stamp_spec.rb
Outdated
@@ -95,7 +95,7 @@ | |||
next unless obj =~ %r{/Type /Page$} | |||
# The page object must contain the annotation reference | |||
# to render a clickable link | |||
expect(obj).to match(%r{^/Annots \[\d \d .\]$}) | |||
expect(obj).to match(%r{^/Annots \[\d+ \d .\]$}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the additional object for the XMP metadata stream the object number for the annotation object switched from single digit to double digit (from 9 to 10). This regex only tested for single digit object number. If we make the XMP metadata stream optional, this change can be reverted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted.
prawn.gemspec
Outdated
@@ -46,6 +46,7 @@ Gem::Specification.new do |spec| | |||
spec.add_development_dependency('pdf-reader', ['~> 1.4', '>= 1.4.1']) | |||
spec.add_development_dependency('rubocop', '~> 0.47.1') | |||
spec.add_development_dependency('rubocop-rspec', '~> 1.10') | |||
spec.add_development_dependency('nokogiri', '~> 1.7') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please no binary dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will look into replacing the veraPDF report parsing with REXML.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
lib/prawn/document.rb
Outdated
@@ -384,7 +384,7 @@ def render(*a, &b) | |||
# pdf.render_file "foo.pdf" | |||
# | |||
def render_file(filename) | |||
File.open(filename, 'wb') { |f| render(f) } | |||
File.open(filename, 'rb+') { |f| render(f) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A file opened with 'wb' does not allow reading. My seek-solution reads the rendered body to avoid a second render pass. As you already pointed pointed out the flaw with seeking, I will revert this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted.
spec/prawn/vera_pdf.rb
Outdated
@@ -1,4 +1,4 @@ | |||
require 'nokogiri' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have spec/extensions
dir for things like this. It probably should be renamed to helpers
but that's how it is. For historical reasons.
Please put this file in that dir.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. It is better located in extensions
. No need to require the veraPDF helpers any more. 👍
@pointlessone I think I have addressed your comments so far. How do you want to deal with rubocop messages? My own projects usually have a policy of only merging when all rubocop messages have been cleared. But I see that rubocop is also complaining about files I have not touched with my own pull request. |
This pull request adds support for PDF/A-1b compliant documents. The need for such a feature was already stated a few years ago (Google Groups).
This pull request needs prawnpdf/pdf-core#34
Those features were missing for PDF/A-1b and were added:
I used veraPDF to validate for PDF/A-1b conformance.