-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WORK IN PROGRESS] - preparsing job recognises uploaded archive better #492
base: master
Are you sure you want to change the base?
Conversation
end | ||
|
||
# now that they are unzipped, check if they're actually proper files | ||
file_is_ok = false | ||
fh = File.open(genotype.genotype.path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did this ever work?? It parses the original uploaded file the way this looks like, that can be anything before extraction
app/workers/preparsing.rb
Outdated
end | ||
|
||
# now that they are unzipped, check if they're actually proper files | ||
file_is_ok = false | ||
fh = File.open(genotype.genotype.path) | ||
fh = File.open "#{Rails.root}/public/data/#{genotype.fs_filename}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use Rails.root.join('path', 'to') instead.
app/workers/preparsing.rb
Outdated
rescue | ||
logger.info "nothing to unzip, seems to be a text-file in the first place" | ||
else | ||
system("cp #{genotype.genotype.path} #{Rails.root}/public/data/#{genotype.fs_filename}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use Rails.root.join('path', 'to') instead.
app/workers/preparsing.rb
Outdated
is_collection = true | ||
end | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra blank line detected.
app/workers/preparsing.rb
Outdated
is_collection = true | ||
when /Zip archive data/ | ||
logger.info "File is zip" | ||
reader = Zip::File.method("open") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer single-quoted strings when you don't need string interpolation or special symbols.
app/workers/preparsing.rb
Outdated
reader = Gem::Package::TarReader.method("new") | ||
is_collection = true | ||
when /Zip archive data/ | ||
logger.info "File is zip" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer single-quoted strings when you don't need string interpolation or special symbols.
app/workers/preparsing.rb
Outdated
is_collection = false | ||
when /gzip compressed data, was/ | ||
reader = Zlib::GzipReader.method("open") | ||
logger.info "File is gz" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer single-quoted strings when you don't need string interpolation or special symbols.
app/workers/preparsing.rb
Outdated
reader = File.method("open") | ||
is_collection = false | ||
when /gzip compressed data, was/ | ||
reader = Zlib::GzipReader.method("open") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer single-quoted strings when you don't need string interpolation or special symbols.
app/workers/preparsing.rb
Outdated
case filetype | ||
when /ASCII text/ | ||
logger.info "File is flat text" | ||
reader = File.method("open") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer single-quoted strings when you don't need string interpolation or special symbols.
app/workers/preparsing.rb
Outdated
filetype = %x{file #{genotype.genotype.path}} | ||
case filetype | ||
when /ASCII text/ | ||
logger.info "File is flat text" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer single-quoted strings when you don't need string interpolation or special symbols.
app/workers/preparsing.rb
Outdated
# | ||
# There are two possible outcomes - file is a collection of files (tar, tar.gz, zip) | ||
# or file is a single file (ASCII, gz) | ||
filetype = %x{file #{genotype.genotype.path}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use backticks around command string.
app/workers/preparsing.rb
Outdated
rescue | ||
logger.info "nothing to unzip, seems to be a text-file in the first place" | ||
else | ||
system("cp #{genotype.genotype.path} #{Rails.root.join('public', 'data', genotype.fs_filename)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use Rails.root.join('path', 'to') instead.
Line is too long. [103/100]
app/workers/preparsing.rb
Outdated
system("mv #{Rails.root}/tmp/#{genotype.fs_filename}.csv #{Rails.root}/public/data/#{genotype.fs_filename}") | ||
logger.info "copied file" | ||
zipfile.extract(biggest, Rails.root.join('tmp', "#{genotype.fs_filename}.csv")) | ||
system("mv #{Rails.root.join('tmp', "#{genotype.fs_filename}.csv")} #{Rails.root.join('public', 'data',genotype.fs_filename)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use Rails.root.join('path', 'to') instead.
Line is too long. [135/100]
Space missing after comma.
app/workers/preparsing.rb
Outdated
logger.info 'file is gz' | ||
is_collection = false | ||
when /gzip compressed data, last modified/ | ||
reader = -> (zipfile) { Gem::Package::TarReader.new(Zlib::GzipReader.open(zipfile)) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not use spaces between -> and opening brace in lambda literals
rescue | ||
logger.info "nothing to unzip, seems to be a text-file in the first place" | ||
else | ||
system("cp #{genotype.genotype.path} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use Rails.root.join('path', 'to') instead.
logger.info "copied file" | ||
zipfile.extract(biggest, Rails.root.join('tmp', "#{genotype.fs_filename}.csv")) | ||
system("mv #{Rails.root.join('tmp', "#{genotype.fs_filename}.csv")} \ | ||
#{Rails.root.join('public', 'data',genotype.fs_filename)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space missing after comma.
logger.info 'file is gz' | ||
is_collection = false | ||
when /gzip compressed data, last modified/ | ||
reader = ->(zipfile){ Gem::Package::TarReader.new(Zlib::GzipReader.open(zipfile)) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space missing to the left of {.
# | ||
# There are two possible outcomes - file is a collection of files (tar, tar.gz, zip) | ||
# or file is a single file (ASCII, gz) | ||
filetype = `file #{genotype.genotype.path}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw some .docx
files as well.. not sure if after extraction or before..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha, yeah, some folks aren't great at uploading the right file types. I'd say we should reject anything that's not ASCII post-unzip :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true. I also saw a few post-genotype analyses from Promethease.
use bash tools to check file type
use the right gem to extract from the archive
make sure everything downstream still works, needs more testing from me
Add more test files in different formats