Tim Rühsen edited this page Oct 15, 2015 · 30 revisions

Wget2 Introduction

The development of Wget2 has started, and everybody is invited to contribute, test, and discuss.
The codebase is hosted in the 'wget2' branch of Wget's git repository and on GitHub; both are synced regularly.

Wget2 on Savannah (check out branch 'wget2' after cloning)

Wget2 on GitHub

The idea is to have a fresh and maintainable codebase with features like multithreaded downloads, HTTP/2, OCSP, HSTS, Metalink, IDNA2008, Public Suffix List, Multi-Proxies, Sitemaps, Atom/RSS Feeds, compression (gzip, deflate, lzma, bzip2), support for local filenames, etc.
Some of these features have been built into Wget in the meantime, but others are really hard to implement in the old codebase.

Most of the functionality is exposed via a library API (libwget), so that external programs can make use of it. For example, have a look at examples/print_css_urls.c: just a few lines of C to parse and print out all URLs from a CSS file.
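To give a rough feel for what such a helper does, here is a minimal Python sketch that pulls `url(...)` and quoted `@import` targets out of a CSS string. It is not libwget's API and not a translation of examples/print_css_urls.c; the function name `extract_css_urls` and the regular expressions are assumptions made purely for illustration, and a real CSS parser also handles comments, escapes, and other cases that this sketch ignores.

```python
import re

# Matches url(...) with optional single or double quotes around the target.
_URL_RE = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)
# Matches @import "file.css" or @import 'file.css'.
# The @import url(...) form is already covered by _URL_RE.
_IMPORT_RE = re.compile(r"""@import\s+(['"])(.*?)\1""", re.IGNORECASE)


def extract_css_urls(css):
    """Return the URL references found in a CSS string, in order of appearance.

    Hypothetical helper for illustration only; not part of libwget.
    """
    found = []
    for regex in (_URL_RE, _IMPORT_RE):
        for match in regex.finditer(css):
            # Keep the match position so results can be sorted by appearance.
            found.append((match.start(), match.group(2)))
    found.sort()
    return [url for _, url in found if url]


if __name__ == "__main__":
    sample = 'body { background: url("img/bg.png"); } @import "base.css";'
    for url in extract_css_urls(sample):
        print(url)
```

The point of the library approach is the same as in this sketch: the parsing logic lives in one reusable function, and the calling program only decides what to do with the URLs.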

Wget2 will remain a separate executable from Wget,
so you can install and test Wget2 without endangering your existing setup and scripts.

What is missing

  • FTP(S) support
  • WARC support
  • Several Wget options
  • HTTP/2 pipelining
  • API documentation

New options

--force-css         Treat input file as CSS. (default: off)
--force-sitemap     Treat input file as Sitemap. (default: off)
--force-atom        Treat input file as Atom Feed. (default: off)
--force-rss         Treat input file as RSS Feed. (default: off)
--num-threads       Max. concurrent download threads. (default: 5)
--gnutls-options    Custom GnuTLS priority string. Interferes with --secure-protocol. (default: none)
--ocsp-stapling     Use OCSP stapling to verify the server's certificate. (default: on)
--ocsp              Use OCSP server access to verify the server's certificate. (default: on)
--ocsp-file         Set file for OCSP caching. (default: .wget_ocsp)
--http2             Use HTTP/2 protocol if possible. (default: on)
--input-encoding    Character encoding of the file contents read with --input-file. (default: local encoding)
--cookie-suffixes   Load public suffixes from file. They prevent 'supercookie' vulnerabilities.
--chunk-size        Download large files in multithreaded chunks. (default: 0 (=off))
                    Example: wget --chunk-size=1M
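
The page does not describe how Wget2 splits a file internally, so the following is only a sketch of the general idea behind chunked downloads: divide the file into byte ranges that separate threads could request with HTTP `Range` headers. The helper names `parse_size`, `chunk_ranges`, and `range_header` are hypothetical, and reading `1M` as 1024 * 1024 bytes is an assumption.

```python
def parse_size(text):
    """Parse a size such as '1M' into bytes.

    Assumption: suffixes are binary multiples (K=1024, M=1024**2, G=1024**3).
    """
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    text = text.strip().upper()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def chunk_ranges(total_size, chunk_size):
    """Split total_size bytes into inclusive (start, end) byte ranges.

    A chunk_size of 0 means chunking is off, so the whole file is one range.
    """
    if total_size <= 0:
        return []
    if chunk_size <= 0:
        return [(0, total_size - 1)]
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


def range_header(start, end):
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    for start, end in chunk_ranges(2_500_000, parse_size("1M")):
        print(range_header(start, end))
```

Each range can be fetched independently, which is what makes this kind of download a natural fit for multiple threads.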
  • new 'include' statement for config files, e.g. to load /etc/wget/conf.d/*.conf

  • --input-file - (reading URLs from stdin) starts downloading as soon as the first URL arrives, so that slow URL generators can feed Wget2

  • check the HTTP 'ETag' header to avoid parsing duplicates

  • use HTTP 'Accept-Encoding': gzip, deflate, lzma, bzip2

  • CLI string options can be set to NULL by prepending --no-, e.g. --no-user-agent
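
The ETag check in the list above can be pictured with a small sketch. How Wget2 actually implements it is not described here; the class name `ETagFilter` and its method are hypothetical, and the sketch simply remembers every ETag value it has seen and reports whether a response still needs parsing.

```python
class ETagFilter:
    """Remember ETags already seen so identical documents are parsed only once.

    Hypothetical helper for illustration; not Wget2's implementation.
    """

    def __init__(self):
        self._seen = set()

    def should_parse(self, headers):
        """Return True if a response with these headers should be parsed.

        Responses without an ETag are always parsed, because there is
        nothing to compare them against.
        """
        etag = None
        for name, value in headers.items():
            # HTTP header names are case-insensitive.
            if name.lower() == "etag":
                etag = value.strip()
                break
        if etag is None:
            return True
        if etag in self._seen:
            return False
        self._seen.add(etag)
        return True


if __name__ == "__main__":
    etags = ETagFilter()
    print(etags.should_parse({"ETag": '"abc123"'}))  # first time this ETag is seen
    print(etags.should_parse({"etag": '"abc123"'}))  # same ETag again
```

A set lookup keeps the check cheap even when many documents are crawled; a production version would likely also need to scope ETags per host, since an ETag is only meaningful for the server that issued it.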

      Feature                   | Wget            | Wget2
      ------------------------- | --------------- | -----