GALLERY-DL.CONF(5) | gallery-dl Manual | GALLERY-DL.CONF(5)
gallery-dl.conf - gallery-dl configuration file
gallery-dl will search for configuration files in the following places every time it is started, unless --ignore-config is specified:
/etc/gallery-dl.conf
$HOME/.config/gallery-dl/config.json
$HOME/.gallery-dl.conf
It is also possible to specify additional configuration files with the -c/--config command-line option or to add further option values with -o/--option as <key>=<value> pairs.
Configuration files are JSON-based and therefore don't allow any ordinary comments, but, since unused keys are simply ignored, it is possible to utilize those as makeshift comments by setting their values to arbitrary strings.
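For instance, a minimal sketch of such a makeshift comment (the "#" key name and the option value below are arbitrary):

{
    "#": "unused keys like this one are ignored and can act as comments",
    "extractor": {
        "base-directory": "~/Downloads/gallery-dl/"
    }
}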
"{manga}_c{chapter}_{page:>03}.{extension}"

{
    "extension == 'mp4'" : "{id}_video.{extension}",
    "'nature' in title"  : "{id}_{title}.{extension}",
    ""                   : "{id}_default.{extension}"
}
If this is an object, it must contain Python expressions mapping to the filename format strings to use. These expressions are evaluated in the order specified in Python 3.6+ and in an undetermined order in Python 3.4 and 3.5.
The available replacement keys depend on the extractor used. A list of keys for a specific one can be acquired by calling *gallery-dl* with the -K/--list-keywords command-line option. For example:
$ gallery-dl -K http://seiga.nicovideo.jp/seiga/im5977527
Keywords for directory names:
-----------------------------
category
  seiga
subcategory
  image

Keywords for filenames:
-----------------------
category
  seiga
extension
  None
image-id
  5977527
subcategory
  image
Note: Even if the value of the extension key is missing or None, it will be filled in later when the file download starts. This key is therefore always available to provide a valid filename extension.
["{category}", "{manga}", "c{chapter} - {title}"]
{ "'nature' in content": ["Nature Pictures"], "retweet_id != 0" : ["{category}", "{user[name]}", "Retweets"], "" : ["{category}", "{user[name]}"] }
If this is an object, it must contain Python expressions mapping to the list of format strings to use.
Each individual string in such a list represents a single path segment, which will be joined together and appended to the base-directory to form the complete target directory path.
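As an illustration with hypothetical metadata values, the directory list above could expand to a target directory such as

./gallery-dl/mangadex/SomeManga/c012 - Some Title/

where ./gallery-dl/ is the base-directory and the remaining segments come from the format strings.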
If this is a string, a parent's metadata is added to its children's metadata under a field named after said string. For example, with "parent-metadata": "_p_":
{ "id": "child-id", "_p_": {"id": "parent-id"} }
Special values:
* "auto": Use characters from
"unix" or "windows" depending on the
local operating system
* "unix": "/"
* "windows": "\\\\|/<>:\"?*"
* "ascii": "^0-9A-Za-z_." (only ASCII
digits, letters, underscores, and dots)
* "ascii+": "^0-9@-[\\]-{ #-)+-.;=!}~"
(all ASCII characters except the ones not allowed by Windows)
Implementation Detail: For strings with length >= 2, this option uses a Regular Expression Character Set, meaning that:
* using a caret ^ as first character inverts the set
* character ranges are supported (0-9a-z)
* ], -, and \ need to be escaped as \\], \\-, and \\\\ respectively to use them as literal characters
Note: In a string with 2 or more characters, []^-\ need to be escaped with backslashes, e.g. "\\[\\]"
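As a hedged example, restricting the "windows" character set from above plus spaces (the exact set chosen here is arbitrary):

"path-restrict": "\\\\|/<>:\"?* "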
Special values:
* "auto": Use characters from
"unix" or "windows" depending on the
local operating system
* "unix": ""
* "windows": ". "
{ "jpeg": "jpg", "jpe" : "jpg", "jfif": "jpg", "jif" : "jpg", "jfi" : "jpg" }
* true: Skip downloads
* false: Overwrite already existing files
* "abort": Stop the current extractor run
* "abort:N": Skip downloads and stop the current
extractor run after N consecutive skips
* "terminate": Stop the current extractor
run, including parent extractors
* "terminate:N": Skip downloads and stop the current
extractor run, including parent extractors, after N consecutive
skips
* "exit": Exit the program altogether
* "exit:N": Skip downloads and exit the program after
N consecutive skips
* "enumerate": Add an enumeration index to the beginning of the filename extension (file.1.ext, file.2.ext, etc.)
This is supported for
* aibooru (*)
* ao3
* aryion
* atfbooru (*)
* bluesky
* booruvar (*)
* coomerparty
* danbooru (*)
* deviantart
* e621 (*)
* e6ai (*)
* e926 (*)
* exhentai
* horne (R)
* idolcomplex
* imgbb
* inkbunny
* kemonoparty
* koharu
* mangadex
* mangoxo
* newgrounds
* nijie (R)
* pillowfort
* sankaku
* seiga
* subscribestar
* tapas
* tsumino
* twitter
* vipergirls
* zerochan
These values can also be specified via the -u/--username and -p/--password command-line options or by using a .netrc file. (see Authentication_)
(*) The password value for these sites should be the API key found in your user profile, not the actual account password.
(R) Logging in with username & password, or supplying logged-in cookies, is required.
Note: Leave the password value empty or undefined to be prompted for a password when performing a login (see getpass()).
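A minimal sketch of supplying credentials in a configuration file (placeholder values; leaving password empty triggers the getpass() prompt mentioned above):

{
    "extractor": {
        "twitter": {
            "username": "<your username>",
            "password": ""
        }
    }
}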
* The Path to a Mozilla/Netscape format cookies.txt file
"~/.local/share/cookies-instagram-com.txt"
* An object specifying cookies as name-value pairs
{ "cookie-name": "cookie-value", "sessionid" : "14313336321%3AsabDFvuASDnlpb%3A31", "isAdult" : "1" }
* A list with up to 5 entries specifying a browser profile.
* The first entry is the browser name
* The optional second entry is a profile name or an absolute path to a profile directory
* The optional third entry is the keyring to retrieve passwords for decrypting cookies from
* The optional fourth entry is a (Firefox) container name ("none" for only cookies with no container (default))
* The optional fifth entry is the domain to extract cookies for. Prefix it with a dot . to include cookies for subdomains.
["firefox"] ["firefox", null, null, "Personal"] ["chromium", "Private", "kwallet", null, ".twitter.com"]
* "random": Select cookies randomly
* "rotate": Select cookies in sequence. Start over from
the beginning after reaching the end of the list.
[ "~/.local/share/cookies-instagram-com-1.txt", "~/.local/share/cookies-instagram-com-2.txt", "~/.local/share/cookies-instagram-com-3.txt", ["firefox", null, null, "c1", ".instagram-com"], ]
* If this is a Path, write cookies to the given file path.
* If this is true and extractor.*.cookies specifies the Path of a valid cookies.txt file, update its contents.
"http://10.10.1.10:3128"
{ "http" : "http://10.10.1.10:3128", "https": "http://10.10.1.10:1080", "http://10.20.1.128": "http://10.10.1.10:5323" }
* If this is a string, it is the proxy URL for all outgoing requests.
* If this is an object, it is a scheme-to-proxy mapping to specify different proxy URLs for each scheme. It is also possible to set a proxy for a specific host by using scheme://host as key. See Requests' proxy documentation for more details.
Note: If a proxy URL does not include a scheme, http:// is assumed.
Can be either a simple string with just the local IP address or a list with IP and explicit port number as elements.
Setting this value to "browser" will try to automatically detect and use the User-Agent used by the system's default browser.
Note: This option has no effect on pixiv, e621, and mangadex extractors, as these need specific values to function correctly.
Optionally, the operating system used in the User-Agent header can be specified after a : (windows, linux, or macos).
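For example, using the system browser's User-Agent with a Windows OS string, following the syntax described above:

"user-agent": "browser:windows"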
Note: requests and urllib3 only support HTTP/1.1, while a real browser would use HTTP/2.
If this is a string, send it as Referer instead of the extractor's root domain.
{ "User-Agent" : "<extractor.*.user-agent>", "Accept" : "*/*", "Accept-Language": "en-US,en;q=0.5", "Accept-Encoding": "gzip, deflate", "Referer" : "<extractor.*.referer>" }
To disable sending a header, set its value to null.
["ECDHE-ECDSA-AES128-GCM-SHA256", "ECDHE-RSA-AES128-GCM-SHA256", "ECDHE-ECDSA-CHACHA20-POLY1305", "ECDHE-RSA-CHACHA20-POLY1305"]
Can be disabled to alter TLS fingerprints and potentially bypass Cloudflare blocks.
For example, setting this option to "gdl_file_url" will cause a new metadata field with name gdl_file_url to appear, which contains the current file's download URL. This can then be used in filenames, with a metadata post processor, etc.
For example, setting this option to "gdl_path" would make it possible to access the current file's filename as "{gdl_path.filename}".
For example, setting this option to "gdl_http" would make it possible to access the current file's Last-Modified header as "{gdl_http[Last-Modified]}" and its parsed form as "{gdl_http[date]}".
The content of the object is as follows:
{ "version" : "string", "is_executable" : "bool", "current_git_head": "string or null" }
Each identifier can be
* A category or basecategory name ("imgur",
"mastodon")
* | A (base)category-subcategory pair, where both names are separated by a
colon ("redgifs:user"). Both names can be a * or left
empty, matching all possible names ("*:image",
":user").
Note: Any blacklist setting will automatically include "oauth", "recursive", and "test".
The resulting archive file is not a plain text file but an SQLite3 database, as lookup operations are significantly faster and memory requirements are significantly lower when the number of stored IDs gets reasonably large.
Note: Archive files that do not already exist get generated automatically.
Note: Archive paths support regular format string replacements, but be aware that using external inputs for building local paths may pose a security risk.
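A sketch using such a format string replacement in an archive path (the directory chosen here is arbitrary):

"archive": "~/.archives/{category}.sqlite3"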
Available events are: file, skip
* "file": Write IDs immediately after
completing or skipping a file download.
* "memory": Keep IDs in memory and only write them after
successful job completion.
See <https://www.sqlite.org/pragma.html#toc> for available PRAGMA statements and further details.
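For example, a hypothetical setting applying two common PRAGMA statements to the archive database:

"archive-pragma": ["journal_mode=WAL", "synchronous=NORMAL"]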
{ "info:Logging in as .+" : "level = debug", "warning:(?i)unable to .+": "exit 127", "error" : [ "status = 1", "exec notify.sh 'gdl error'", "abort" ] }
[ ["info:Logging in as .+" , "level = debug"], ["warning:(?i)unable to .+", "exit 127" ], ["error" , [ "status = 1", "exec notify.sh 'gdl error'", "abort" ]] ]
pattern is parsed as severity level (debug, info, warning, error, or integer value) followed by an optional Python Regular Expression separated by a colon :. Using * as level or leaving it empty matches logging messages of all levels (e.g. *:<re> or :<re>).
action is parsed as action type followed by (optional) arguments.
It is possible to specify more than one action per pattern by providing them as a list: ["<action1>", "<action2>", …]
Supported Action Types:
status: Modify job exit status.
  Expected syntax is <operator> <value> (e.g. = 100).
  Supported operators are = (assignment), & (bitwise AND), | (bitwise OR), ^ (bitwise XOR).
level: Modify severity level of the current logging message.
  Can be one of debug, info, warning, error or an integer value.
print: Write argument to stdout.
exec: Run a shell command.
abort: Stop the current extractor run.
terminate: Stop the current extractor run, including parent extractors.
restart: Restart the current extractor run.
wait: Sleep for a given Duration or wait until Enter is pressed when no argument was given.
exit: Exit the program with the given argument as exit status.
[ { "name": "zip" , "compression": "store" }, { "name": "exec", "command": ["/home/foobar/script", "{category}", "{image_id}"] } ]
Unlike other options, a postprocessors setting at a deeper level does not override any postprocessors setting at a lower level. Instead, all post processors from all applicable postprocessors settings get combined into a single list.
For example

* an mtime post processor at extractor.postprocessors,
* a zip post processor at extractor.pixiv.postprocessors,
* and using --exec

will run all three post processors - mtime, zip, exec - for each downloaded pixiv file.
{ "archive": null, "keep-files": true }
2xx codes (success responses) and 3xx codes (redirection messages) will never be retried and always count as success, regardless of this option.
5xx codes (server error responses) will always be retried, regardless of this option.
This value gets internally used as the timeout parameter for the requests.request() method.
If this is a string, it must be the path to a CA bundle to use instead of the default certificates.
This value gets internally used as the verify parameter for the requests.request() method.
Setting this to false won't download any files, but all other functions (postprocessors, download archive, etc.) will be executed as normal.
These can be specified as
* index: 3 (file number 3)
* range: 2-4 (files 2, 3, and 4)
* slice: 3:8:2 (files 3, 5, and 7)
Arguments for range and slice notation are optional and will default to begin (1) or end (sys.maxsize) if omitted. For example 5-, 5:, and 5:: all mean "Start at file number 5".
Note: The index of the first file is 1.
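For example, using the range notation from the list above to only download files 2 through 4:

"image-range": "2-4"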
A file only gets downloaded when *all* of the given expressions evaluate to True.
Available values are the filename-specific ones listed by -K or -j.
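A hedged example with two expressions that must both evaluate to True, assuming the extractor provides width and extension fields:

"image-filter": ["width >= 1200", "extension == 'png'"]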
See strptime for a list of formatting directives.
Note: Despite its name, this option does **not** control how {date} metadata fields are formatted. To use a different formatting for those values other than the default %Y-%m-%d %H:%M:%S, put strptime formatting directives after a colon :, for example {date:%Y%m%d}.
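For instance, a filename format string applying such a custom {date} format (the {id} field is hypothetical and depends on the extractor):

"filename": "{date:%Y%m%d}_{id}.{extension}"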
Special values:
* "all": Include HTTP request and response
headers. Hide Authorization, Cookie, and Set-Cookie
values.
* "ALL": Include all HTTP request and response
headers.
* true: Start on users' main gallery pages and recursively descend into subfolders
* false: Get posts from "Latest Updates" pages
This value must be divisible by 16 and gets rounded down otherwise. The maximum possible value appears to be 1920.
Supported module types are image, video, mediacollection, embed, text.
https://developers.google.com/blogger/docs/3.0/using#APIKey
Possible values are "avatar", "background", "posts", "replies", "media", "likes",
It is possible to use "all" instead of listing all values separately.
* facets: hashtags, mentions, and uris
* user: detailed user metadata for the user referenced in the input URL (See app.bsky.actor.getProfile).
(See depth parameter of app.bsky.feed.getPostThread)
* true: Match URLs with *all* possible TLDs (e.g. bunkr.xyz or bunkrrr.duck)
* false: Match only URLs with known TLDs
Available types are image, video, download, gallery.
* "rest": Public REST API
* "trpc": Internal TRPC API
See API/Authorization for details.
Available types are model, image, gallery.
* For "api": "rest", this can be one of "None", "Soft", "Mature", "X" to set the highest returned mature content flag.
* For "api": "trpc", this can be an integer whose bits select the returned mature content flags.
For example, 12 (4|8) would return only Mature and X rated images, while 3 (1|2) would return only None and Soft rated images.
Known available options include original, quality, width.
Note: Set this option to an arbitrary letter, e.g., "w", to download images in JPEG format at their original resolution.
Setting this option to "auto" uses the same domain as a given input URL.
* true: Original ZIP archives
* false: Converted video files
It is possible to specify a custom list of metadata includes. See available_includes for possible field names. aibooru also supports ai_metadata.
Note: This requires 1 additional HTTP request per 200-post batch.
Note: Changing this setting is normally not necessary. When the value is greater than the per-page limit, gallery-dl will stop after the first batch. The value cannot be less than 1.
Setting an explicit filter ID overrides any default filters and can be used to access 18+ content without API Key.
See Filters for details.
Try to download the view_url version of these posts when this option is disabled.
Note: Enabling this option also enables deviantart.comments.
Note: Enabling this option also enables deviantart.metadata.
* true: Use a flat directory structure.
* false: Collect a list of all gallery-folders or favorites-collections and transfer any further work to other extractors (folder or collection), which will then create individual subdirectories for each of them.
Note: Going through all gallery folders cannot fetch deviations which aren't in any folder.
Note: Gathering this information requires a lot of API calls. Use with caution.
When disabled, assume every given profile name belongs to a regular user.
Special values:
* "skip": Skip groups
Possible values are "avatar", "background", "gallery", "scraps", "journal", "favorite", "status".
It is possible to use "all" instead of listing all values separately.
* "html": HTML with (roughly) the same layout
as on DeviantArt.
* "text": Plain text with image references and HTML tags
removed.
* "none": Don't download textual content.
Note: No longer functional as of 2023-10-11
This option simply sets the mature_content parameter for API calls to either "true" or "false" and does not do any other form of content filtering.
Provides description, tags, license, and is_watching fields when enabled.
It is possible to request extended metadata by specifying a list of

* camera    : EXIF information (if available)
* stats     : deviation statistics
* submission: submission information
* collection: favourited folder information (requires a refresh token)
* gallery   : gallery folder information (requires a refresh token)
Set this option to "all" to request all extended metadata categories.
See /deviation/metadata for official documentation.
Setting this option to "images" only downloads original files if they are images and falls back to preview versions for everything else (archives, videos, etc.).
* "api": Trust the API and stop when
has_more is false.
* "manual": Disregard has_more and only stop when
a batch of results is empty.
Set this option to "all" to download previews for all files.
Disable this option to *force* using a private token for all requests when a refresh token is provided.
Set this to "png" to download a PNG version of these images instead.
Using a refresh-token allows you to access private or otherwise not publicly available deviations.
Note: The refresh-token becomes invalid after 3 months or whenever your cache file is deleted or cleared.
Each format is parsed as SIZE.EXT.
Leave SIZE empty to download the regular, small avatar format.
Note: This requires 0-2 additional HTTP requests per post.
Note: Changing this setting is normally not necessary. When the value is greater than the per-page limit, gallery-dl will stop after the first batch. The value cannot be less than 1.
Note: Set this to "favdel" to remove galleries from your favorites.
Note: This will remove any Favorite Notes when applied to already favorited galleries.
* "resized": Continue downloading
non-original images.
* "stop": Stop the current extractor run.
* "wait": Wait for user input before retrying the current
image.
Adds archiver_key, posted, and torrents. Makes date and filesize more precise.
* "hitomi": Download the corresponding gallery from hitomi.la
* true: Extract embed URLs and download them if supported (videos are not downloaded).
* "ytdl": Like true, but let ytdl handle video extraction and download for YouTube, Vimeo, and SoundCloud embeds.
* false: Ignore embeds.
Note: This requires 1 additional API call per photo. See flickr.photos.getAllContexts for details.
Note: This requires 1 additional API call per photo. See flickr.photos.getExif for details.
It is possible to specify a custom list of metadata includes. See the extras parameter in Flickr's API docs for possible field names.
* If this is an integer, it specifies the maximum image dimension (width and height) in pixels.
* If this is a string, it should be one of Flickr's format specifiers ("Original", "Large", ... or "o", "k", "h", "l", ...) to use as an upper limit.
* "text": Plain text with HTML tags removed
* "html": Raw HTML content
Possible values are "gallery", "scraps", "favorite".
It is possible to use "all" instead of listing all values separately.
* "auto": Automatically differentiate between
"old" and "new"
* "old": Expect the *old* site layout
* "new": Expect the *new* site layout
* "asc": Ascending favorite date order
(oldest first)
* "desc": Descending favorite date order (newest first)
* "reverse": Same as "asc"
If not set, a temporary guest token will be used.
An invalid or not up-to-date value will result in 401 Unauthorized errors.
Keeping this option unset will use an extra HTTP request to attempt to fetch the current value used by gofile.
Possible values are "pictures", "scraps", "stories", "favorite".
It is possible to use "all" instead of listing all values separately.
Available formats are "webp" and "avif".
"original" will try to download the original jpg or png versions, but is most likely going to fail with 403 Forbidden errors.
These tokens allow using the API instead of having to scrape HTML pages, providing more detailed metadata (date, description, etc.).
See https://imgchest.com/docs/api/1.0/general/authorization for instructions on how to generate such a token.
* true: Follow Imgur's advice and choose MP4 if the prefer_video flag in an image's metadata is set.
* false: Always choose GIF.
* "always": Always choose MP4.
(See API#Search for details)
* "rest": REST API - higher-resolution media
* "graphql": GraphQL API - lower-resolution media
* true: Start from the beginning. Log the most recent cursor value when interrupted before reaching the end.
* false: Start from the beginning.
* any string: Start from the position defined by this value.
Possible values are "posts", "reels", "tagged", "stories", "highlights", "info", "avatar".
It is possible to use "all" instead of listing all values separately.
Note: This metadata is always available when referring to a user by name, e.g. instagram.com/USERNAME.
* "asc": Same order as displayed in a post
* "desc": Reverse order as displayed in a post
* "reverse": Same as "desc"
Note: This option does *not* affect {num}. To enumerate files in reverse order, use count - num + 1.
* "asc": Same order as displayed
* "desc": Reverse order as displayed
* "id" or "id_asc": Ascending order by
ID
* "id_desc": Descending order by ID
* "reverse": Same as "desc"
Note: This option only affects highlights.
Note: This requires 1 additional HTTP request per post.
* true: Download duplicates
* false: Ignore duplicates
Available types are artist and post.
Available types are file, attachments, and inline.
Set this to "unique" to filter out duplicate revisions.
Note: This requires 1 additional HTTP request per post.
* "asc": Ascending order (oldest first)
* "desc": Descending order (newest first)
* "reverse": Same as "asc"
Use "all" to download all available formats, or a (comma-separated) list to select multiple formats.
If the selected format is not available, the first in the list gets chosen (usually mp3).
Disabling this option causes a gallery to be downloaded as individual image files.
When more than one format is given, the first available one is selected.
Possible formats are "780", "980", "1280", "1600", "0" (original).
Setting this option to "auto" uses the same domain as a given input URL.
Use true to download animated images as gifs and false to download as mp4 videos.
(See /manga/{id}/feed and /user/follows/manga/feed)
The general syntax is "<source name>:<ISO
639-1 language code>".
Both are optional, meaning "koala",
"koala:", ":en",
or even just ":" are possible as well.
Specifying the numeric ID of a source is also supported.
Note: gallery-dl comes with built-in tokens for mastodon.social, pawoo and baraag. For other instances, you need to obtain an access-token in order to use usernames in place of numerical user IDs.
Note: Not supported by all moebooru instances.
If the selected format is not available, the next smaller one gets chosen.
If this is a list, try each given filename extension in original resolution or recoded format until an available format is found.
Possible values are "art", "audio", "games", "movies".
It is possible to use "all" instead of listing all values separately.
Possible values are "illustration", "doujin", "favorite", "nuita".
It is possible to use "all" instead of listing all values separately.
* true: Download videos
* "ytdl": Download videos using ytdl
* false: Skip video Tweets
* true: Use Python's webbrowser.open() method to automatically open the URL in the user's default browser.
* false: Ask the user to copy & paste a URL from the terminal.
Note: All redirects will go to port 6414, regardless of the port specified here. You'll have to manually adjust the port number in your browser's address bar when using a different port than the default.
Note: This requires 1 additional HTTP request per post.
Available types are postfile, images, image_large, attachments, and content.
Setting this option to "auto" uses the same domain as a given input URL.
Possible values are "artworks", "avatar", "background", "favorite", "novel-user", "novel-bookmark".
It is possible to use "all" instead of listing all values separately.
Note: This requires 1 additional API call per bookmarked post.
* "japanese": List of Japanese tags
* "translated": List of translated tags
* "original": Unmodified list with both Japanese and translated
tags
These animations come as a .zip archive containing all animation frames in JPEG format by default.
Set this option to "original" to download them as individual, higher-quality frames.
Use an ugoira post processor to convert them to watchable animations. (Example)
Use true to download animated images as gifs and false to download as mp4 videos.
* "stop: Stop the current extractor run.
* "wait: Ask the user to solve the CAPTCHA and wait.
"auto" uses the quality parameter of the input URL or "hq" if not present.
Reddit's internal default and maximum values for this parameter appear to be 200 and 500 respectively.
The value 0 ignores all comments and significantly reduces the time required when scanning a subreddit.
Note: This requires 1 additional API call for every 100 extra comments.
Special values:
* 0: Recursion is disabled
* -1: Infinite recursion (don't do this)
Using a refresh-token allows you to access private or otherwise not publicly available subreddits, given that your account is authorized to do so, but requests to the reddit API are going to be rate limited at 600 requests every 10 minutes/600 seconds.
* true: Download videos and use ytdl to handle HLS and DASH manifests
* "ytdl": Download videos and let ytdl handle all of video extraction and download
* "dash": Extract DASH manifest URLs and use ytdl to download and merge them. (*)
* false: Ignore videos
(*) This saves 1 HTTP request per video and might potentially be able to download otherwise deleted videos, but it will not always get the best video quality available.
If a selected format is not available, the next one in the list will be tried until an available format is found.
If the format is given as string, it will be extended with ["hd", "sd", "gif"]. Use a list with one element to restrict it to only one possible format.
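For example, restricting downloads to a single format by using a one-element list, as described above:

"format": ["hd"]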
* "alphanumeric" or "alnum":
11-character alphanumeric IDs (y0abGlDOr2o)
* "numeric" or "legacy": numeric IDs
(360451)
* Grids: 460x215, 920x430, 600x900, 342x482, 660x930, 512x512, 1024x1024
* Heroes: 1920x620, 3840x1240, 1600x650
* Logos: N/A (will be ignored)
* Icons: 8x8, 10x10, 14x14, 16x16, 20x20, 24x24, 28x28, 32x32, 35x35, 40x40, 48x48, 54x54, 56x56, 57x57, 60x60, 64x64, 72x72, 76x76, 80x80, 90x90, 96x96, 100x100, 114x114, 120x120, 128x128, 144x144, 150x150, 152x152, 160x160, 180x180, 192x192, 194x194, 256x256, 310x310, 512x512, 768x768, 1024x1024
* Grids: png, jpeg, jpg, webp
* Heroes: png, jpeg, jpg, webp
* Logos: png, webp
* Icons: png, ico
* score_desc (Highest Score (Beta))
* score_asc (Lowest Score (Beta))
* score_old_desc (Highest Score (Old))
* score_old_asc (Lowest Score (Old))
* age_desc (Newest First)
* age_asc (Oldest First)
* Grids: alternate, blurred, no_logo, material, white_logo
* Heroes: alternate, blurred, material
* Logos: official, white, black, custom
* Icons: official, custom
To generate a token, visit /user/USERNAME/list-tokens and click Create Token.
Allows skipping over posts without having to waste API calls.
For each photo with "maximum" resolution (width equal to 2048 or height equal to 3072) or each inline image, use an extra HTTP request to find the URL to its full-resolution version.
* "api": next parameter provided by
the API (potentially misses posts due to a bug in Tumblr's API)
* "before": timestamp of last post
* "offset": post offset number
* "abort": Raise an error and stop extraction
* "wait": Wait until rate limit reset
Possible types are text, quote, link, answer, video, audio, photo, chat.
It is possible to use "all" instead of listing all types separately.
Setting an explicit filter ID overrides any default filters and can be used to access 18+ content without API Key.
See Filters for details.
Try to download the view_url version of these posts when this option is disabled.
* false: Ignore cards
* true: Download image content from supported cards
* "ytdl": Additionally download video content from
unsupported cards using ytdl
Possible values are
* card names
* card domains
* <card name>:<card domain>
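A sketch combining all three value forms from the list above (entries are illustrative):

"cards-blacklist": ["summary", "youtube.com", "player:twitch.tv"]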
If this option is equal to "accessible", only download from conversation Tweets if the given initial Tweet is accessible.
* "auto": Always auto-generate a token.
* "cookies": Use token given by the ct0 cookie if
present.
* true: Start from the beginning. Log the most recent cursor value when interrupted before reaching the end.
* false: Start from the beginning.
* any string: Start from the position defined by this value.
Note: A cursor value from one timeline cannot be used with another.
Going through a timeline with this option enabled is essentially the same as running gallery-dl https://twitter.com/i/web/status/<TweetID> with enabled conversations option for each Tweet in said timeline.
Note: This requires at least 1 additional API call per initial Tweet.
Possible values are "info", "avatar", "background", "timeline", "tweets", "media", "replies", "likes".
It is possible to use "all" instead of listing all values separately.
* "restid": /TweetResultByRestId -
accessible to guest users
* "detail": /TweetDetail - more stable
* "auto": "detail" when logged in,
"restid" otherwise
Known available sizes are 4096x4096, orig, large, medium, and small.
If this option is enabled, gallery-dl will try to fetch a quoted (original) Tweet when it sees the Tweet which quotes it.
* "abort": Raise an error and stop extraction
* "wait": Wait until rate limit reset
* "wait:N": Wait for N seconds
* "abort": Raise an error and stop extraction
* "wait": Wait until the account is unlocked and
retry
If this value is "self", only consider replies where reply and original Tweet are from the same user.
Note: Twitter will automatically expand conversations if you use the /with_replies timeline while logged in. For example, media from Tweets which the user replied to will also be downloaded.
It is possible to exclude unwanted Tweets using image-filter (extractor.*.image-filter).
If this value is "original", metadata for these files will be taken from the original Tweets, not the Retweets.
* "tweets": /tweets timeline + search
* "media": /media timeline + search
* "with_replies": /with_replies timeline + search
* "auto": "tweets" or
"media", depending on retweets and
text-tweets settings
This only has an effect with a metadata (or exec) post processor with "event": "post" and appropriate filename.
When not specified and asked for by Twitter, this identifier will need to be entered in an interactive prompt.
Special values:
* "user":
https://twitter.com/i/user/{rest_id}
* "timeline":
https://twitter.com/id:{rest_id}/timeline
* "tweets":
https://twitter.com/id:{rest_id}/tweets
* "media":
https://twitter.com/id:{rest_id}/media
Note: To allow gallery-dl to follow custom URL formats, set the blacklist for twitter to a non-default value, e.g. an empty string "".
* true: Download videos
* "ytdl": Download videos using ytdl
* false: Skip video Tweets
Available formats are "raw", "full", "regular", "small", and "thumb".
For example "viper.click" if the main domain is blocked or to bypass Cloudflare,
Note: Requires login or cookies
Possible values are "avatar", "gallery", "spaces", "collection",
It is possible to use "all" instead of listing all values separately.
See https://wallhaven.cc/help/api for more information.
Possible values are "uploads", "collections".
It is possible to use "all" instead of listing all values separately.
Note: This requires 1 additional HTTP request per post.
Note: This requires 1 additional HTTP request per submission.
Set this to "video" to download GIFs as video files.
Possible values are "home", "feed", "videos", "newvideo", "article", "album".
It is possible to use "all" instead of listing all values separately.
If this value is "original", metadata for these files will be taken from the original posts, not the retweeted posts.
The value must be between 10 and 500.
See yt-dlp options / youtube-dl options
See yt-dlp format selection / youtube-dl format selection
Set this option to "force" for the same effect as --force-generic-extractor.
Note: Set quiet and no_warnings in extractor.ytdl.raw-options to true to suppress all output.
Setting this to null will try to import "yt_dlp" followed by "youtube_dl" as fallback.
{ "quiet": true, "writesubtitles": true, "merge_output_format": "mkv" }
Available options can be found in yt-dlp's docstrings / youtube-dl's docstrings
Note: This requires 1-2 additional HTTP requests per post.
* "api": Use the JSON API (no
extension metadata)
* "html": Parse HTML pages (limited to 100 pages * 24
posts)
Note: This requires 1 additional HTTP request per post.
Note: This requires 1 additional HTTP request per post.
When multiple names are given, download the first available one.
* true: Start with the latest chapter
* false: Start with the first chapter
Possible values are valid integer or floating-point numbers optionally followed by one of k, m, g, t, or p. These suffixes are case-insensitive.
* true: Write downloaded data into .part files and rename them upon download completion. This mode additionally supports resuming incomplete downloads.
* false: Do not use .part files and write data directly into the actual output files.
Missing directories will be created as needed. If this value is null, .part files are going to be stored alongside the actual output files.
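For example, keeping .part files in a separate location (the path chosen here is arbitrary):

"part-directory": "/tmp/.download/"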
Set this option to null to disable this indicator.
Possible values are valid integer or floating-point numbers optionally followed by one of k, m, g, t, or p. These suffixes are case-insensitive.
Disable the use of a proxy for file downloads by explicitly setting this option to null.
For example, this will change the filename extension ({extension}) of a file called example.png from png to jpg when said file contains JPEG/JFIF data.
If the value is true, consume the response body. This avoids closing the connection and therefore improves connection reuse.
If the value is false, immediately close the connection without reading the response. This can be useful if the server is known to send large bodies for error responses.
Possible values are integer numbers optionally followed by one of k, m, g, t, or p. These suffixes are case-insensitive.
Codes 200, 206, and 416 (when resuming a partial download) will never be retried and always count as success, regardless of this option.
5xx codes (server error responses) will always be retried, regardless of this option.
Fail a download when a file does not pass instead of downloading a potentially broken file.
See yt-dlp options / youtube-dl options
See yt-dlp format selection / youtube-dl format selection
Note: Set quiet and no_warnings in downloader.ytdl.raw-options to true to suppress all output.
Setting this to null will try to import "yt_dlp" followed by "youtube_dl" as fallback.
See yt-dlp output template / youtube-dl output template.
Special values:
* null: generate filenames with extractor.*.filename
* "default": use ytdl's default, currently "%(title)s [%(id)s].%(ext)s" for yt-dlp / "%(title)s-%(id)s.%(ext)s" for youtube-dl
Note: An output template other than null might cause unexpected results in combination with certain options (e.g. "skip": "enumerate")
{ "quiet": true, "writesubtitles": true, "merge_output_format": "mkv" }
Available options can be found in yt-dlp's docstrings / youtube-dl's docstrings
* "null": No output
* "pipe": Suitable for piping to other processes or files
* "terminal": Suitable for the standard Windows console
* "color": Suitable for terminals that understand ANSI
escape codes and colors
* "auto": "terminal" on Windows with
output.ansi disabled, "color" otherwise.
It is possible to use custom output format strings by setting this option to an object and specifying start, success, skip, progress, and progress-total.
For example, the following will replicate the same output as mode: color:
{ "start" : "{}", "success": "\r\u001b[1;32m{}\u001b[0m\n", "skip" : "\u001b[2m{}\u001b[0m\n", "progress" : "\r{0:>7}B {1:>7}B/s ", "progress-total": "\r{3:>3}% {0:>7}B {1:>7}B/s " }
start, success, and skip are used to output the current filename, where {} or {0} is replaced with said filename. If a given format string contains printable characters other than that, their number needs to be specified as [<number>, <format string>] to get the correct results for output.shorten. For example
"start" : [12, "Downloading {}"]
progress and progress-total are used when displaying the download progress indicator, progress when the total number of bytes to download is unknown, progress-total otherwise.
For these format strings
* {0} is number of bytes downloaded
* {1} is number of downloaded bytes per second
* {2} is total number of bytes
* {3} is percent of bytes downloaded to total bytes
"utf-8"
{ "encoding": "utf-8", "errors": "replace", "line_buffering": true }
Possible options are
* encoding
* errors
* newline
* line_buffering
* write_through
When this option is specified as a simple string, it is interpreted as {"encoding": "<string-value>", "errors": "replace"}
Note: errors always defaults to "replace"
Set this option to "eaw" to also work with east-asian characters with a display width greater than 1.
{ "success": "1;32", "skip" : "2", "debug" : "0;37", "info" : "1;37", "warning": "1;33", "error" : "1;31" }
Output for mode: color
* success: successfully downloaded files
* skip: skipped files
Logging Messages:
* debug: debug logging messages
* info: info logging messages
* warning: warning logging messages
* error: error logging messages
* true: Show the default progress indicator ("[{current}/{total}] {url}")
* false: Do not show any progress indicator
* Any string: Show the progress indicator using this as a custom format string. Possible replacement keys are current, total and url.
If this is a simple string, it specifies the format string for logging messages.
The default format string here is "{message}".
The default format string here is also "{message}".
When combined with -I/--input-file-comment or -x/--input-file-delete, this option will cause *all* input URLs from these files to be commented/deleted after processing them and not just successful ones.
{ "Pictures": ["jpg", "jpeg", "png", "gif", "bmp", "svg", "webp"], "Video" : ["flv", "ogv", "avi", "mp4", "mpg", "mpeg", "3gp", "mkv", "webm", "vob", "wmv"], "Music" : ["mp3", "aac", "flac", "ogg", "wma", "m4a", "wav"], "Archives": ["zip", "rar", "7z", "tar", "gz", "bz2"] }
Files with an extension not listed will be ignored and stored in their default location.
* "replace": Replace/Overwrite the old version with the new one
* "enumerate": Add an enumeration index to the filename of the new version like skip = "enumerate"
* "abort:N": Stop the current extractor run after N consecutive files compared as equal.
* "terminate:N": Stop the current extractor run, including parent extractors, after N consecutive files compared as equal.
* "exit:N": Exit the program after N consecutive files compared as equal.
archive-format, archive-prefix, and archive-pragma options, akin to extractor.*.archive-format, extractor.*.archive-prefix, and extractor.*.archive-pragma, are supported as well.
* If this is a string, it will be executed using the system's shell, e.g. /bin/sh. Any {} will be replaced with the full path of a file or target directory, depending on exec.event
* If this is a list, the first element specifies the program name and any further elements its arguments. Each element of this list is treated as a format string using the files' metadata as well as {_path}, {_directory}, and {_filename}.
See metadata.event for a list of available events.
See metadata.event for a list of available events.
"sha256:hash_sha,sha3_512:hash_sha3"
{ "hash_sha" : "sha256", "hash_sha3": "sha3_512" }
For a list of available hash algorithms, run
python -c "import hashlib; print('\n'.join(hashlib.algorithms_available))"
or see python/hashlib.
* If this is a string, it is parsed as a comma-separated list of algorithm-fieldname pairs:
[<hash algorithm> ":"] <field name> ["," ...]
When <hash algorithm> is omitted, <field name> is used as algorithm name.
* If this is an object, it is a <field name> to <algorithm name> mapping for hash digests to compute.
* "json": write metadata using
json.dump()
* "jsonl": write metadata in JSON Lines
<https://jsonlines.org/> format
* "tags": write tags separated by newlines
* "custom": write the result of applying
metadata.content-format to a file's metadata dictionary
* "modify": add or modify metadata entries
* "delete": remove metadata entries
Using "-" as filename will write all output to stdout.
If this option is set, metadata.extension and metadata.extension-format will be ignored.
* false: current target location for file downloads (base-directory + directory)
* true: current base-directory location
* any Path: custom location
Note: metadata.extension is ignored if this option is set.
Available events are:
init              After post processor initialization and before the first file download
finalize          On extractor shutdown, e.g. after all files were downloaded
finalize-success  On extractor shutdown when no error occurred
finalize-error    On extractor shutdown when at least one error occurred
prepare           Before a file download
prepare-after     Before a file download, but after building and checking file paths
file              When completing a file download, but before it gets moved to its target location
after             After a file got moved to its target location
skip              When skipping a file download
post              When starting to download all files of a post, e.g. a Tweet on Twitter or a post on Patreon
post-after        After downloading all files of a post
Note: Missing or undefined fields will be silently ignored.
Note: Cannot be used with metadata.include.
["blocked", "watching", "status[creator][name]"]
{ "blocked" : "***", "watching" : "\fE 'yes' if watching else 'no'", "status[username]": "{status[creator][name]!l}" }
Note: Only applies for "mode": "custom".
See the ensure_ascii argument of json.dump() for further details.
Note: Only applies for "mode": "json" and "jsonl".
See the indent argument of json.dump() for further details.
Note: Only applies for "mode": "json".
See the separators argument of json.dump() for further details.
Note: Only applies for "mode": "json" and "jsonl".
See the sort_keys argument of json.dump() for further details.
Note: Only applies for "mode": "json" and "jsonl".
For example, use "a" to append to a file's content or "w" to truncate it.
See the mode argument of open() for further details.
See the encoding argument of open() for further details.
archive-format, archive-prefix, and archive-pragma options, akin to extractor.*.archive-format, extractor.*.archive-prefix, and extractor.*.archive-pragma, are supported as well.
Enabling this option will only have an effect *if* there is actual mtime metadata available, that is

* after a file download ("event": "file" (default), "event": "after")
* when running *after* an mtime post processor for the same event
For example, a metadata post processor for "event": "post" will *not* be able to set its file's modification time unless an mtime post processor with "event": "post" runs *before* it.
See metadata.event for a list of available events.
This value must be either a UNIX timestamp or a datetime object.
Note: This option gets ignored if mtime.value is set.
The resulting value must be either a UNIX timestamp or a datetime object.
archive-format, archive-prefix, and archive-pragma options, akin to extractor.*.archive-format, extractor.*.archive-prefix, and extractor.*.archive-pragma, are supported as well.
See metadata.event for a list of available events.
This function is specified as <module>:<function name> and gets called with the current metadata dict as argument.
module is either an importable Python module name or the Path to a .py file.
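A minimal sketch of such a post processor entry (module path and function name here are hypothetical):

{
    "name"    : "python",
    "event"   : "after",
    "function": "~/scripts/my_hooks.py:process_file"
}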
When no value is given, extractor.*.filename is used.
When no value is given, extractor.*.filename is used.
Possible values are

* "concat" (inaccurate frame timecodes for non-uniform frame delays)
* "image2" (accurate timecodes, requires nanosecond file timestamps, i.e. no Windows or macOS)
* "mkvmerge" (accurate timecodes, only WebM or MKV, requires mkvmerge)
* "archive" (store "original" frames in a .zip archive)

"auto" will select mkvmerge if available and fall back to concat otherwise.
* true: Enable ffmpeg output
* false: Disable all ffmpeg output
* any string: Pass -hide_banner and -loglevel with this value as argument to ffmpeg
* "auto": Automatically assign a fitting
frame rate based on delays between frames.
* "uniform": Like auto, but assign an explicit
frame rate only to Ugoira with uniform frame delays.
* any other string: Use this value as argument for -r.
* null or an empty string: Don't set an explicit frame
rate.
This option, when libx264/5 is used, automatically adds ["-vf", "crop=iw-mod(iw\\,2):ih-mod(ih\\,2)"] to the list of ffmpeg command-line arguments to reduce an odd width/height by 1 pixel and make them even.
If this is a string, use it as alternate filename for frame delay files.
Possible values are "store", "zip", "bzip2", "lzma".
Note: Relative paths are relative to the current download directory.
* "safe": Update the central directory file header each time a file is stored in a ZIP archive.
This greatly reduces the chance a ZIP archive gets corrupted in case the Python interpreter gets shut down unexpectedly (power outage, SIGKILL) but is also a lot slower.
Any file in a specified directory with a .py filename extension gets imported and searched for potential extractors, i.e. classes with a pattern attribute.
Note: null references internal extractors defined in extractor/__init__.py or by extractor.modules.
Set this option to null or an invalid path to disable this cache.
For example, setting this option to "#" would allow a replacement operation to be Rold#new# instead of the default Rold/new/
* choose a name
* select "installed app"
* set http://localhost:6414/ as "redirect uri"
* solve the "I'm not a robot" reCAPTCHA if needed
* click "create app"
* copy the client id (third line, under your application's name and "installed app") and put it in your configuration file as "client-id"
* use "Python:<application name>:v1.0 (by /u/<username>)" as user-agent and replace <application name> and <username> accordingly (see Reddit's API access rules)
* clear your cache to delete any remaining access-token entries (gallery-dl --clear-cache reddit)
* get a refresh-token for the new client-id (gallery-dl oauth:reddit)
* If given as string, it is parsed according to date-format.
* If given as integer, it is interpreted as UTC timestamp.
* If given as a single float, it will be used as that exact value.
* If given as a list with 2 floating-point numbers a & b, it will be randomly chosen with uniform distribution such that a <= N <= b. (see random.uniform())
* If given as a string, it can either represent a single float value ("2.85") or a range ("1.5-3.0").
Simple tilde expansion and environment variable expansion is supported.
In Windows environments, backslashes ("\") can, in addition to forward slashes ("/"), be used as path separators. Because backslashes are JSON's escape character, they themselves have to be escaped. The path C:\path\to\file.ext has therefore to be written as "C:\\path\\to\\file.ext" if you want to use backslashes.
{ "format" : "{asctime} {name}: {message}", "format-date": "%H:%M:%S", "path" : "~/log.txt", "encoding" : "ascii" }
{ "level" : "debug", "format": { "debug" : "debug: {message}", "info" : "[{name}] {message}", "warning": "Warning: {message}", "error" : "ERROR: {message}" } }
* format
  * General format string for logging messages or an object with format strings for each loglevel.
    In addition to the default LogRecord attributes, it is also possible to access the current extractor, job, path, and keywords objects and their attributes, for example "{extractor.url}", "{path.filename}", "{keywords.title}"
  * Default: "[{name}][{levelname}] {message}"
* format-date
  * Format string for {asctime} fields in logging messages (see strftime() directives)
  * Default: "%Y-%m-%d %H:%M:%S"
* level
  * Minimum logging message level (one of "debug", "info", "warning", "error", "exception")
  * Default: "info"
* path
  * Path to the output file
* mode
  * Mode in which the file is opened; use "w" to truncate or "a" to append (see open())
  * Default: "w"
* encoding
  * File encoding
  * Default: "utf-8"
Note: path, mode, and encoding are only applied when configuring logging output to a file.
{ "name": "mtime" }
{ "name" : "zip", "compression": "store", "extension" : "cbz", "filter" : "extension not in ('zip', 'rar')", "whitelist" : ["mangadex", "exhentai", "nhentai"] }
It is possible to set a "filter" expression similar to image-filter to only run a post-processor conditionally.
It is also possible to set a "whitelist" or "blacklist" to only enable or disable a post-processor for the specified extractor categories.
The available post-processor types are

classify    Categorize files by filename extension
compare     Compare versions of the same file and replace/enumerate them on mismatch
            (requires downloader.*.part = true and extractor.*.skip = false)
exec        Execute external commands
hash        Compute file hash digests
metadata    Write metadata to separate files
mtime       Set file modification time according to its metadata
python      Call Python functions
rename      Rename previously downloaded files
ugoira      Convert Pixiv Ugoira to WebM using ffmpeg
zip         Store files in a ZIP archive
https://github.com/mikf/gallery-dl/issues
Mike Fährmann <mike_faehrmann@web.de>
and https://github.com/mikf/gallery-dl/graphs/contributors
gallery-dl(1)
2024-09-28 | 1.27.5