Hubert Hirtz
f0812d7848
[utils] Handle user:pass in URLs ( #28801 )
...
* Handle user:pass in URLs
Fixes "nonnumeric port" errors when youtube-dl is given URLs with
usernames and passwords such as:
http://username:password@example.com/myvideo.mp4
Refs:
- https://en.wikipedia.org/wiki/Basic_access_authentication
- https://tools.ietf.org/html/rfc1738#section-3.1
- https://docs.python.org/3.8/library/urllib.parse.html#urllib.parse.urlsplit
Fixes #18276 (point 4)
Fixes #20258
Fixes #26211 (see comment)
* Align code with yt-dlp
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2024-03-04 01:27:55 +00:00
dirkf
427472351c
[utils] Make restricted filenames ignore characters in Unicode categories Mark, Other
...
Resolves #32629
2023-11-29 22:08:01 +00:00
dirkf
66ab0814c4
[utils] Revert bbd3e7e
, updating docstring, test instead
2023-09-03 23:15:19 +01:00
dirkf
bbd3e7e999
[utils] Properly handle list values in update_url()
...
An actual list value in a query update could have been treated
as a list of values because of the key:list parse_qs format.
2023-09-03 01:18:22 +01:00
dirkf
2efc8de4d2
[utils] Advertise optional supported Content-Encoding
s
2023-08-01 01:05:09 +01:00
dirkf
e4178b5af3
[utils] Add and use filter_dict()
from yt-dlp
2023-08-01 01:05:09 +01:00
dirkf
2d2a4bc832
[utils] Revise isinstance()
tests (especially for str/unicode/bytes) to complete Linter fix
2023-08-01 01:05:09 +01:00
dirkf
7d965e6b65
[utils] Avoid comparing type(var)
, etc, to pass new Linter rules
2023-08-01 01:05:09 +01:00
dirkf
abef53466d
[utils] Rework URL path munging for ., .. components
...
* move processing to YoutubeDLHandler
* also process `Location` header for redirect
* use tests from https://github.com/yt-dlp/yt-dlp/pull/7662
2023-07-29 14:27:26 +01:00
dirkf
e7926ae9f4
[utils] Rework decoding of Content-Encoding
s
...
* support nested encodings
* support optional `br` encoding, if brotli package is installed
* support optional 'compress' encoding, if ncompress package is installed
* response `Content-Encoding` has only unprocessed encodings, or removed
* response `Content-Length` is decoded length (usable for filesize metadata)
* use zlib for both deflate and gzip decompression
* some elements taken from yt-dlp: thx especially coletdjnz
2023-07-29 14:27:26 +01:00
dirkf
2b7dd3b2a2
[utils] Fix update_Request() with empty data (not None)
2023-07-25 13:19:43 +01:00
dirkf
1fa8b86f0b
[utils] Remove stray undocumented Host header in redirect (fix 46fde7c
)
2023-07-20 05:29:59 +01:00
dirkf
a190b55964
[utils] Fix broken Py 3.11+ compat in traverse_obj()
...
* inspect.getargspec is missing despite doc claiming backward compat
* replace with emulation of `Signature.bind()`
2023-07-19 22:14:50 +01:00
dirkf
cb9366eda5
[utils] Minor updates (merge_dicts, T)
...
A couple of mods to ease yt-dlp back-ports:
* add kwargs to merge_dicts:
`unblank=True` (disallow empty string), `rev=False` (reverse the merge list)
* add `T(x)` shortcut for `{x}`, unsupported in Py2.6
2023-07-19 22:14:50 +01:00
dirkf
d9d07a9581
[utils] Improve js_to_json, align with yt-dlp
...
* support variable substitution, from https://github.com/yt-dlp/yt-dlp/pull/#521 etc,
thanks ChillingPepper, Grub4k, pukkandan
* improve escape handling, from https://github.com/yt-dlp/yt-dlp/pull/#521
thanks Grub4k
* support template strings from https://github.com/yt-dlp/yt-dlp/pull/6623
thanks Grub4k
* add limited `!` evaluation (eg, !!0 -> false, see tests)
2023-07-19 22:14:50 +01:00
dirkf
825a40744b
[utils] Align traverse_obj() with yt-dlp
...
Thanks Grub4k for these:
* traverse `Iterable`s, from https://github.com/yt-dlp/yt-dlp/pull/6902 , etc
* traverse `set` key for transformations/filters, `re.Match` group names, from
776995bc10
, etc
* traverse `re.Match`es, from https://github.com/yt-dlp/yt-dlp/pull/5174
* always return list when branching, from https://github.com/yt-dlp/yt-dlp/pull/5170
2023-07-19 22:14:50 +01:00
dirkf
1d8d5a93f7
[test] Fixes for old Pythons
2023-07-18 10:50:46 +01:00
bashonly
3801d36416
[utils] YoutubeDLCookieJar
: Add get_cookie_header
and get_cookies_for_url
methods
2023-07-18 10:50:46 +01:00
dirkf
b383be9887
[core] Remove Cookie
header on redirect to prevent leaks
...
Adated from yt-dlp/yt-dlp-ghsa-v8mc-9377-rwjj/pull/1/commits/101caac
Thx coletdjnz
2023-07-18 10:50:46 +01:00
dirkf
46fde7caee
[core] Update redirect handling from yt-dlp
...
* Thx coletdjnz: https://github.com/yt-dlp/yt-dlp/pull/7094
* add test that redirected `POST` loses its `Content-Type`
2023-07-18 10:50:46 +01:00
dirkf
f47fdb9564
[utils] Add {expected_type} and Iterable support to traverse_obj()
2023-07-18 10:50:46 +01:00
dirkf
f24bc9272e
[Misc] Fixes for 2.6 compatibility
2023-07-05 22:58:54 +01:00
dirkf
11cc3f3ad0
[utils] Fix compiled_regex_type
in 249f2b6
2023-05-11 20:53:07 +01:00
dirkf
64d6dd64c8
[YouTube] Support Releases tab
2023-04-23 22:58:35 +01:00
dirkf
25124bd640
[devscripts] Improve hack to convert command-line options to API options
...
* define equality for DateRange
* don't show default DateRange
2023-04-05 19:05:16 +01:00
dirkf
f35b757c82
[utils] Ensure allow_types
for variadic()
is a tuple
2023-03-19 02:29:00 +00:00
pukkandan
1d3751c3fe
Escape URLs in sanitized_Request
, not sanitize_url
d2558234cf5dd12d6896eed5427b7dcdb3ab7b5a added escaping of URLs while sanitizing. However, sanitize_url
may not always receive an actual URL. Eg: When using youtube-dl "search query" --default-search ytsearch
, search query
gets escaped to search%20query
before being prefixed with ytsearch:
which is not the intended behavior. So the escaping is moved to sanitized_Request
instead.
2023-02-20 20:27:25 +00:00
dirkf
90c9f789d9
[utils] Add parse_qs, update_url
...
[skip ci]
2023-02-13 03:54:51 +00:00
dirkf
58988c1421
[YouTube] Bypass age-gating for certain restricted videos
...
* Use TVHTML5_SIMPLY_EMBEDDED_PLAYER client
* Also add and fix tests
* Introduce and use new utility function `update_url()`
2023-02-13 03:54:51 +00:00
Andrei Lebedev
27ed77aabb
[utils] Backport traverse_obj (etc) from yt-dlp ( #31156 )
...
* Backport traverse_obj and closely related function from yt-dlp (code by pukkandan)
* Backport LazyList, variadic(), try_call (code by pukkandan)
* Recast using yt-dlp's newer traverse_obj() implementation and tests (code by grub4k)
* Add tests for Unicode case folding support matching Py3.5+ (requires f102e3d
)
* Improve/add tests for variadic, try_call, join_nonempty
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-11-03 10:09:37 +00:00
dirkf
c94a459a24
[utils] Sanitize look-alike Unicode glyphs in non-ID filename fields when --restrict-filenames
...
Implements https://github.com/ytdl-org/youtube-dl/issues/31216#issuecomment-1236102822 , which has a test.
2022-10-11 12:18:12 +00:00
dirkf
556862bc91
[utils] Ensure RFC3986 encoding result is unicode
2022-08-21 00:45:06 +01:00
dirkf
d231b56717
[jsinterp] Overhaul JSInterp to handle new YT players 4c3f79c5, 324f67b9 ( #31170 )
...
* back-port from yt-dlp 8f53dc44a0cc1c2d98c35740b9293462c080f5d0, thanks pukkandan
* also support void, improve <</>> precedence, improve expressions in comma-list
* add more tests
2022-08-14 18:45:45 +01:00
pukkandan
0700fde640
[utils, etc] Kill child processes when yt-dl is killed
...
* derived from PR #26592 , closes #26592
Authored by: Unrud
2022-06-10 19:57:46 +01:00
pukkandan
1baa0f5f66
[utils] Escape URL while sanitizing
...
Closes #31008 , #yt-dlp/263
While this fixes the issue in question, it does not try to address the root-cause of the problem
Refer: 915f911e365736227e134ad654601443dbfd7ccb, f5fa042c82300218a2d07b95dd6b9c0756745db3
2022-06-06 16:03:04 +01:00
dirkf
52c3751df7
[utils] Enable ALPN in HTTPS to satisfy broken servers
...
See https://github.com/yt-dlp/yt-dlp/issues/3878
2022-05-28 13:52:51 +01:00
Sergey M․
cfee2dfe83
[utils] PEP 8
2021-04-17 03:32:04 +07:00
Sergey M․
a00a7e0cad
[utils] Add support for support for experimental HTTP response status code 308 Permanent Redirect (refs #27877 , refs #28768 )
2021-04-17 03:22:13 +07:00
Remita Amine
e88c9ef62a
[utils] add a function to clean podcast URLs
2021-01-04 01:14:25 +01:00
Remita Amine
9dd674e1d2
[utils] accept only supported protocols in url_or_none
2020-12-30 09:22:30 +01:00
Josh Soref
71ddc222ad
Fix typos ( #27084 )
...
* spelling: authorization
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: brightcove
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: creation
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: exceeded
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: exception
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: extension
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: extracting
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: extraction
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: frontline
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: improve
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: length
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: listsubtitles
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: multimedia
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: obfuscated
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: partitioning
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: playlist
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: playlists
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: restriction
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: services
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: split
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: srmediathek
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: support
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: thumbnail
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: verification
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* spelling: whitespaces
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-11-21 22:00:05 +07:00
Sergey M․
fe07e788bf
[utils] Skip ! prefixed code in js_to_json
2020-11-17 01:30:43 +07:00
Kevin O'Connor
4eda10499e
[utils] Don't attempt to coerce JS strings to numbers in js_to_json ( #26851 )
...
The current logic in `js_to_json` tries to rewrite octal/hex numbers to
decimal. However, when the logic actually happens the `"` or `'` have
already been trimmed off. This causes what were originally strings, that
happen to look like octal/hex numbers, to get rewritten to decimal and
returned as a number rather than a string.
In practive something like:
```js
{
"0x40": "foo",
"040": "bar",
}
```
would get rewritten as:
```json
{
64: "foo",
32: "bar
}
```
This is problematic since this isn't valid JSON as you cannot have
non-string keys.
2020-10-18 00:10:41 +07:00
Sergey M․
1d9bf655e6
[utils] Recognize wav mimetype ( closes #26463 )
2020-09-06 11:19:53 +07:00
Rob
9cd5f54e31
[utils] Fix file permissions in write_json_file ( closes #12471 ) ( #25122 )
2020-05-20 03:21:52 +07:00
Sergey M․
c380cc28c4
[utils] Improve cookie files support
...
+ Add support for UTF-8 in cookie files
* Skip malformed cookie file entries instead of crashing (invalid entry len, invalid expires at)
2020-05-05 04:21:25 +07:00
Sergey M․
f1a8511f7b
[utils] Add reference to cookie file format
2020-03-10 04:59:02 +07:00
Sergey M․
042b664933
Revert "[utils] Add support for cookies with spaces used instead of tabs"
...
According to [1] TABs must be used as separators between fields.
Files produces by some tools with spaces as separators are considered
malformed.
1. https://curl.haxx.se/docs/http-cookies.html
This reverts commit cff99c91d1
.
2020-03-10 04:53:51 +07:00
Sergey M․
cff99c91d1
[utils] Add support for cookies with spaces used instead of tabs
2020-03-08 18:01:32 +07:00
Sergey M․
fca6dba8b8
[YoutubeDL] Force redirect URL to unicode on python 2
2020-02-29 19:08:44 +07:00