NAME
s3up - s3 uploader proof-of-concept
SYNOPSIS
s3up [ <options> ] [ <globs> ]
DESCRIPTION
s3up is a proof-of-concept for uploading files to s3, taking advantage
of the features detailed on
Checking object integrity
https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html
while also calculating full-body checksums to ease validation using
local tools such as md5 or sha256sum.
s3up takes one or more options, detailed in the OPTIONS section below.
When processing files the only required option is -bucket. When
processing a stream the -key option is also required.
When processing files the -key option may be:
- left unspecified, in which case the filepath name is used;
- set to a prefix (ending in a slash ("/")), in which case the prefix is
  prepended to each filepath name; or
- set to a non-prefix value, in which case the filepath name is replaced
  with the specified -key name.
As its final arguments s3up takes one or more file glob patterns for
files to upload. A glob can be a full filename or valid glob pattern,
e.g., '*.pdf', to match against a list of files. Alternatively if no
globs are provided then s3up will read from the standard input stream,
in which case a non-prefix -key name is required.
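The -key rules above can be sketched as a small function. This is only
an illustration of the rules as described in this README, not the
s3up implementation: the function name resolve_object_key is
hypothetical, and it assumes the filepath is used verbatim (whether
s3up normalizes leading "./" or "/" is not stated here).

```python
from typing import Optional


def resolve_object_key(key: Optional[str], filepath: Optional[str]) -> str:
    """Return the object key for one upload (hypothetical helper).

    filepath is None when the data is read from standard input.
    """
    if filepath is None:
        # Stream: a non-prefix -key is required.
        if not key or key.endswith("/"):
            raise ValueError("a non-prefix -key is required for a stream")
        return key
    if not key:
        # No -key given: the filepath name is used.
        return filepath
    if key.endswith("/"):
        # Prefix: prepend it to the filepath name.
        return key + filepath
    # Non-prefix value: replaces the filepath name.
    return key
```

Note that a non-prefix -key combined with several files would map every
file to the same object name, so a prefix is the natural choice when a
glob matches more than one file.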
OPTIONS
-h | -help | --help
Print out help and exit.
-bucket string
Required name of the bucket to upload objects to.
-key string
If <globs> are specified then optionally set the name of the
object, or a prefix ending in '/' when uploading multiple
files. If no <globs> are specified then a non-prefix -key is
required.
-part-size value
Optionally specify the size of parts to upload.
(minimum: 5MiB, maximum: 5GiB, default: 5GiB)
-recursive
Optionally recursively process directories listed in <globs>
for files to upload.
-profile string
Optionally specify the AWS profile name to use.
-concurrent-objects int
Optionally specify the number of concurrent objects to upload.
(default: 1)
-concurrent-parts int
Optionally specify the number of concurrent parts to upload per
object.
(default: 1)
-manifest value
Optionally specify a manifest type to produce on standard
output. Valid options are:
- json: produce full details about the object as JSON
- md5: MD5 checksum and <bucket>/<key>
- checksum: selected checksum and <bucket>/<key>
- aws: AWS hash-of-hashes checksum and <bucket>/<key>
- etag: AWS Object ETag and <bucket>/<key>
See MANIFESTS below for more details.
-media-types string
Optionally specify a path to a tab-separated-value file with
each line listing an extension and a media-type to use when
setting the content-type of an upload, e.g.,
.pdf application/pdf
.txt text/plain
Comments may be added by starting the line with '#', and these
will be ignored.
Each mapping loaded either overrides the existing mapping for
that extension or, if there is none, is added to the mappings.
-verbose
Optionally enable verbose logging to standard error.
-checksum string
Optionally specify the checksum algorithm to use, one of
SHA256, SHA1, CRC32, or CRC32C.
(default: SHA256)
-disable-path-style
Optionally disable use of older AWS S3 path-style requests (this
would be appropriate to set when copying to Amazon S3 instead of
to Elm).
-disable-s3-pool
Optionally disable use of multiple s3 clients (this would be
appropriate to set when copying to Amazon S3 instead of to Elm).
-max-part-id value
Optionally limit the number of parts to upload in a multi-part
object.
(default: 10000)
-use-temp-dir string
Optionally specify a directory to use for temporary files
created when buffering a stream.
-use-memory
Optionally specify that memory buffers should be used instead
of temporary files when buffering a stream.
-copy-buf string
Optionally specify the buffer size used to copy chunks
in-between readers and writers during processing.
(default: 256KiB)
-upload-part-timeout duration
Optionally set a timeout for any UploadPart requests. Use the
suffix "s" for seconds, "m" for minutes, or "h" for hours, e.g.,
15m for 15 minutes.
(default: 0s, no timeout)
-complete-multipart-timeout duration
Optionally set a timeout for any CompleteMultipartUpload
requests. Use the suffix "s" for seconds, "m" for minutes, or
"h" for hours, e.g., 15m for 15 minutes.
(default: 0s, no timeout)
-abort-multipart-timeout duration
Optionally set a timeout for any AbortMultipartUpload requests.
Use the suffix "s" for seconds, "m" for minutes, or "h" for
hours, e.g., 15m for 15 minutes.
(default: 0s, no timeout)
-leave-parts-on-error
Optionally do not abort failed uploads, leaving parts on the
server for manual recovery.
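The file format accepted by -media-types (described under that option
above) can be read with a few lines of Python. This is a sketch under
assumptions, not the parser s3up uses: the function names are
hypothetical, blank lines and lines without a tab are skipped, and
extensions are kept exactly as written.

```python
from typing import Dict, Iterable


def parse_media_types(lines: Iterable[str]) -> Dict[str, str]:
    """Parse tab-separated '<extension>\\t<media-type>' lines."""
    mappings: Dict[str, str] = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Skip blank lines (an assumption) and '#' comment lines.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # assumption: ignore lines without a tab separator
        mappings[fields[0].strip()] = fields[1].strip()
    return mappings


def merge_media_types(existing: Dict[str, str],
                      loaded: Dict[str, str]) -> Dict[str, str]:
    """Loaded mappings override existing ones or are added to them."""
    merged = dict(existing)
    merged.update(loaded)
    return merged
```

For example, parse_media_types(open("types.tsv")) would return a
dictionary keyed by extension, where types.tsv is a placeholder name
for a file in the format shown under -media-types.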
MANIFESTS
Manifest types supported are:
- json: produce full details about the object in JSON format
- md5: MD5 checksum and <bucket>/<key>
- checksum: selected checksum and <bucket>/<key>
- aws: AWS hash-of-hashes checksum and <bucket>/<key>
- etag: AWS Object ETag and <bucket>/<key>
With the exception of json the manifests take the form of
<value> <bucket>/<key>
Where <value> is a hex-encoded checksum (e.g., as produced by md5sum,
sha1sum, sha256sum), an ETag as produced by AWS, or a base64 encoded
hash-of-hashes as detailed in the AWS documentation section:
Using part-level checksums for multipart uploads
https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html#large-object-checksums
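As that AWS section describes, the hash-of-hashes for a multipart
upload is obtained by hashing the concatenated binary digests of the
individual parts and base64-encoding the result; AWS reports it with a
"-<number of parts>" suffix. A minimal sketch, assuming fixed-size
parts and a hashlib algorithm such as SHA256 or SHA1 (CRC32 and CRC32C
are not covered here). The function name is hypothetical, and whether
the "aws" manifest value includes the part-count suffix is not stated
in this README, so the count is returned separately.

```python
import base64
import hashlib
from typing import Tuple


def composite_checksum(data: bytes, part_size: int,
                       algorithm: str = "sha256") -> Tuple[str, int]:
    """Return (base64 hash-of-hashes, number of parts) for data."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    # Digest each part separately; empty data counts as one empty part.
    digests = [
        hashlib.new(algorithm, data[offset:offset + part_size]).digest()
        for offset in range(0, max(len(data), 1), part_size)
    ]
    # Hash the concatenation of the binary part digests.
    combined = hashlib.new(algorithm, b"".join(digests)).digest()
    return base64.b64encode(combined).decode("ascii"), len(digests)
```

Because the result depends on where the part boundaries fall, the same
file uploaded with a different -part-size yields a different value,
which is why the full-body checksums are easier to validate with local
tools.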
The "md5" and "checksum" formats mimic the manifest format produced
and checked by command-line tools such as md5sum and sha1sum:
$ ./s3up --bucket test-jrobinso -manifest md5 *.dat
0386a9abe1d45fedae59fc3381506533 test-jrobinso/a-a-100MB.dat
10b41d719cc4a3e5e3228858ea84d533 test-jrobinso/a-a-200MB.dat
061fa237522bd2200ed14453ddfa6c86 test-jrobinso/a-a-500MB.dat
$ ./s3up --bucket test-jrobinso --checksum sha1 -manifest checksum *.dat
7bc7c147b691b55b4bc05ae3a40fa3bcc274e3fd test-jrobinso/a-a-100MB.dat
481eb555e10d651a84abf64c76e558deab947fae test-jrobinso/a-a-200MB.dat
5698313d0c7e27c16270c08fb250d544a14aa8b4 test-jrobinso/a-a-500MB.dat
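A manifest line in this form can be checked against a local copy of
the file. The sketch below is illustrative only: the function names are
hypothetical, the value and target are assumed to be separated by
whitespace, and only algorithms known to Python's hashlib (e.g., md5,
sha1, sha256) are handled.

```python
import hashlib
from typing import Tuple


def parse_manifest_line(line: str) -> Tuple[str, str, str]:
    """Split '<value> <bucket>/<key>' into (value, bucket, key)."""
    value, target = line.rstrip("\r\n").split(None, 1)
    bucket, key = target.split("/", 1)
    return value, bucket, key


def file_hexdigest(path: str, algorithm: str = "md5",
                   chunk_size: int = 1 << 20) -> str:
    """Hex digest of a local file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_local_file(line: str, path: str, algorithm: str = "md5") -> bool:
    """True if the manifest value equals the local file's digest."""
    value, _bucket, _key = parse_manifest_line(line)
    return value.lower() == file_hexdigest(path, algorithm)
```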
When a json manifest is requested s3up produces a JSON array. Each
record in the array corresponds to an uploaded object and contains
metadata calculated by s3up followed by metadata fetched from the S3
server (the latter is the ObjectAttributes object). A sample record:
    {
      "Bucket": "test-jrobinso",
      "Key": "500GB-in-large-files/a/y/a-y-500MB.dat",
      "Completed": true,
      "Aborted": false,
      "FullChecksums": {
        "ChecksumMD5": {
          "Hex": "77faeaf43e9e70ec067f7927d3e53424",
          "Base64": "d/rq9D6ecOwGf3kn0+U0JA=="
        },
        "ChecksumSHA256": {
          "Hex": "a8c8f8906df45d5311bdcb7168541f960d8c8fcda6c12e144e7d1240405dc9cb",
          "Base64": "qMj4kG30XVMRvctxaFQflg2Mj82mwS4UTn0SQEBdycs="
        }
      },
      "ObjectChecksum": {
        "ChecksumSHA256": {
          "Hex": "a8c8f8906df45d5311bdcb7168541f960d8c8fcda6c12e144e7d1240405dc9cb",
          "Base64": "qMj4kG30XVMRvctxaFQflg2Mj82mwS4UTn0SQEBdycs="
        }
      },
      "ObjectAttributes": {
        "LastModified": "2024-08-28T19:12:51Z",
        "ETag": "77faeaf43e9e70ec067f7927d3e53424",
        "Checksum": {
          "ChecksumSHA256": {
            "Hex": "a8c8f8906df45d5311bdcb7168541f960d8c8fcda6c12e144e7d1240405dc9cb",
            "Base64": "qMj4kG30XVMRvctxaFQflg2Mj82mwS4UTn0SQEBdycs="
          }
        },
        "ObjectParts": {
          "IsTruncated": false,
          "TotalPartsCount": 1,
          "Parts": [
            {
              "PartNumber": 1,
              "Size": 500000000,
              "ChecksumMD5": {
                "Hex": "77faeaf43e9e70ec067f7927d3e53424",
                "Base64": "d/rq9D6ecOwGf3kn0+U0JA=="
              }
            }
          ]
        }
      }
    }
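A record like the one above can be sanity-checked after an upload. The
following sketch is one possible check, not part of s3up: the function
names are hypothetical, and it assumes that ObjectChecksum holds the
value calculated by s3up, that ObjectAttributes.Checksum holds the
value reported by the server, and that any Errors field sits at the top
level of the record.

```python
import base64
from typing import Any, Dict, List


def encodings_agree(pair: Dict[str, str]) -> bool:
    """True if the Hex and Base64 fields encode the same bytes."""
    try:
        return bytes.fromhex(pair["Hex"]) == base64.b64decode(
            pair["Base64"], validate=True)
    except (KeyError, ValueError):  # binascii.Error is a ValueError
        return False


def check_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one manifest record."""
    problems: List[str] = []
    if not record.get("Completed") or record.get("Aborted"):
        problems.append("upload did not complete")
    if "Errors" in record:
        problems.append("errors were reported")
    local = record.get("ObjectChecksum", {})
    remote = record.get("ObjectAttributes", {}).get("Checksum", {})
    for name, pair in local.items():
        if not encodings_agree(pair):
            problems.append(f"{name}: Hex and Base64 disagree")
        if remote.get(name, {}).get("Hex") != pair.get("Hex"):
            problems.append(f"{name}: local and server values differ")
    return problems
```

An empty list means the record passed these checks. Since the json
manifest is an array, a caller would apply check_record to each
element returned by json.load.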
If errors were encountered they will be listed in an additional Errors
field. The outline of an Errors field is:
    "Errors": {
      "PutObjectError": "<error>",
      "UploadPartErrors": [
        {
          "PartNumber": 1,
          "Error": "<error>"
        }
      ],
      "CompleteMultipartUploadError": "<error>",
      "AbortMultipartUploadError": "<error>",
      "GetObjectAttributesError": "<error>"
    }