dirkf
b2ba24bb02
[InfoExtractor] Add `_match_valid_url()` class method and refactor
...
* API compatible with yt-dlp
* also support Sequence of patterns in _VALID_URL
* one place to compile _VALID_URL
* TODO: remove existing extractor shims
2023-07-19 22:14:50 +01:00
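A minimal sketch of the idea in the commit above, not the actual youtube-dl code: a class method that accepts either one pattern or a sequence of patterns in `_VALID_URL` and compiles them in a single place. The class name `ExampleIE`, the example.com patterns, and the `_compiled_patterns` cache attribute are assumptions made for illustration.

```python
import re


class ExampleIE(object):
    # Hypothetical extractor: _VALID_URL may be a single string or a sequence
    _VALID_URL = (
        r'https?://example\.com/video/(?P<id>\d+)',
        r'https?://example\.com/v/(?P<id>\d+)',
    )

    @classmethod
    def _match_valid_url(cls, url):
        # Compile once per class; looking in cls.__dict__ stops a subclass
        # from reusing patterns cached on its parent class.
        if '_compiled_patterns' not in cls.__dict__:
            patterns = cls._VALID_URL
            if isinstance(patterns, str):
                patterns = (patterns,)
            cls._compiled_patterns = tuple(re.compile(p) for p in patterns)
        # Return the first successful match object, or None
        for pattern in cls._compiled_patterns:
            mobj = pattern.match(url)
            if mobj:
                return mobj
        return None
```

Callers would then use `ExampleIE._match_valid_url(url).group('id')` instead of compiling `_VALID_URL` themselves.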
dirkf
b2741f2654
[InfoExtractor] Add search methods for Next/Nuxt.js from yt-dlp
...
* add _search_nextjs_data(), from https://github.com/yt-dlp/yt-dlp/pull/1386 , thanks selfisekai
* add _search_nuxt_data(), from https://github.com/yt-dlp/yt-dlp/pull/1921 , thanks Lesmiscore, pukkandan
* add tests for the above
* also fix HTML5 type recognition and tests, from 222a230871 , thanks Lesmiscore
* update extractors in PR using above, fix tests.
2023-07-19 22:14:50 +01:00
dirkf
8465222041
[Clipchamp] Add new extractor back-ported from yt-dlp
2023-07-19 22:14:50 +01:00
dirkf
4339910df3
[DLF] Add site extractors back-ported from yt-dlp
...
* from https://github.com/yt-dlp/yt-dlp/pull/6697 , thanks nick-cd
2023-07-19 22:14:50 +01:00
dirkf
eaaf4c6736
[Whyp] Add extractor back-ported from yt-dlp
...
* from https://github.com/yt-dlp/yt-dlp/pull/6803 , thanks CoryTibbettsDev
2023-07-19 22:14:50 +01:00
dirkf
4566e6e53e
[GlobalPlayer] Add site extractors back-ported from yt-dlp
...
* from https://github.com/yt-dlp/yt-dlp/pull/6903 , thanks garret1317
2023-07-19 22:14:50 +01:00
dirkf
1e8ccdd2eb
[InfoExtractor] Support groups in `_search_regex()`, etc
2023-07-19 22:14:50 +01:00
dirkf
fa7f0effbe
[YouTube] Avoid crash in author extraction
2023-06-22 23:14:21 +01:00
pukkandan
9112e668a5
[YouTube] Improve nsig function name extraction
...
Fixes player b7910ca8, using `,` vs `;`
See https://github.com/ytdl-org/youtube-dl/issues/32292#issuecomment-1602231170
Co-authored-by: dirkf
2023-06-22 16:46:53 +01:00
dirkf
07af47960f
[YouTube] Improve fix for ae8ba2c
...
Thx: https://github.com/yt-dlp/yt-dlp/commit/01aba25
2023-06-18 00:52:18 +01:00
dirkf
ae8ba2c319
[YouTube] Fix KeyError QV in signature extraction failed
...
* temporarily force missing global definition into sig JS
* improve test: thanks https://github.com/yt-dlp/yt-dlp/issues/7327#issuecomment-1595274615
* resolves #32314
2023-06-17 15:55:19 +01:00
dirkf
ee731f3d00
[ITV] Fix UA capitalisation in 384f632
2023-05-23 16:50:25 +01:00
dirkf
64d6dd64c8
[YouTube] Support Releases tab
2023-04-23 22:58:35 +01:00
dirkf
2da3fa04a6
[YouTube] Simplify signature patterns
2023-04-12 23:53:14 +01:00
pukkandan
3f6d2bd76f
[extractor/youtube] Bypass throttling for -f17
...
and related cleanup
Thanks @AudricV for the finding
Ref: yt-dlp/yt-dlp/commit/c9abebb
2023-03-19 02:29:00 +00:00
pukkandan
88f28f620b
[extractor/youtube] Construct fragment list lazily
...
Ref: yt-dlp/yt-dlp/commit/e389d17
See: yt-dlp/yt-dlp#6517
2023-03-19 02:29:00 +00:00
dirkf
6fece0a96b
[AENetworksBaseIE] Report missing show data instead of crash
2023-03-14 16:23:20 +00:00
pukkandan
3da17834a4
[Youtube] Construct dash formats with `range` query
...
See yt-dlp/yt-dlp#6369
2023-03-03 15:02:15 +00:00
dirkf
f7ce98a21e
[YouTube] Support @owner format in uploader_id etc
...
* implement https://github.com/ytdl-org/youtube-dl/issues/31530#issuecomment-1435734719
* update affected tests
* misc clean-ups
2023-02-24 12:22:16 +00:00
pukkandan
1d3751c3fe
Escape URLs in `sanitized_Request`, not `sanitize_url`
d2558234cf5dd12d6896eed5427b7dcdb3ab7b5a added escaping of URLs while sanitizing. However, `sanitize_url` may not always receive an actual URL. Eg: When using `youtube-dl "search query" --default-search ytsearch`, `search query` gets escaped to `search%20query` before being prefixed with `ytsearch:`, which is not the intended behavior. So the escaping is moved to `sanitized_Request` instead.
2023-02-20 20:27:25 +00:00
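The ordering problem described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the youtube-dl implementation: `sanitize_url()` here is a simplified stand-in, and `escape_url()`, `resolve_input()` and `build_request_url()` are hypothetical helpers.

```python
from urllib.parse import quote


def sanitize_url(url):
    # Simplified stand-in: repair a scheme-less URL, but do NOT escape,
    # because the argument may be a search query rather than a URL
    if url.startswith('//'):
        return 'http:' + url
    return url


def escape_url(url):
    # Hypothetical helper: percent-encode unsafe characters such as spaces
    return quote(url, safe='/:?&=#%')


def resolve_input(arg, default_search='ytsearch'):
    # Sanitizing happens before the default-search prefix is applied,
    # so escaping at this stage would mangle a plain search query
    arg = sanitize_url(arg)
    if '://' in arg:
        return arg
    return '%s:%s' % (default_search, arg)


def build_request_url(url):
    # Escaping is deferred to the point where a real request is built
    return escape_url(sanitize_url(url))
```

With this split, `resolve_input('search query')` keeps the space, while `build_request_url('http://example.com/a b')` escapes it.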
df
6067451e43
[Vimeo] Fix e19ec52 for tween-age Pythons
...
* a check in older Pythons (2.7 and earlier, and the 3.3 and 3.4 series) caused "sre_constants.error: nothing to repeat"
* satisfy the check by avoiding nested qualifiers that can match empty string
Resolves #31597
2023-02-20 01:41:46 +00:00
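A rough illustration of the rewrite described in the commit above, using made-up patterns rather than the actual Vimeo regex. `NESTED` repeats a group whose body can itself match the empty string, the shape the commit says older interpreters reject. `FLAT` drops the outer qualifier and accepts the same inputs. The sketch only checks that equivalence on a current Python; it does not reproduce the old error.

```python
import re

# Hypothetical patterns, for illustration only
NESTED = r'/(?:[a-z]*)*$'  # inner [a-z]* can match empty, then is repeated
FLAT = r'/[a-z]*$'         # same language, no nested qualifier


def same_verdict(text):
    # True when both patterns agree on whether `text` matches
    return bool(re.match(NESTED, text)) == bool(re.match(FLAT, text))


# Short samples covering matching and non-matching inputs
SAMPLES = ('/', '/abc', '/abc1', 'abc')
ALL_AGREE = all(same_verdict(s) for s in SAMPLES)
```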
dirkf
2dd6c6edd8
[YouTube] Avoid crash if uploader_id extraction fails
...
See #31530 .
2023-02-17 11:16:54 +00:00
dirkf
42b098dd79
[InfoExtractor] Handle unquoted values in OpenGraph searches
2023-02-14 02:53:16 +00:00
fonkap
6f8c2635a5
[StreamsbIE] Add extractor for streamsb.com (viewsb.com) ( #31517 )
...
* Add extractor for streamsb.com (viewsb.com)
* make data url using app.js version
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-13 03:54:51 +00:00
fonkap
de48105dd8
[KommunetvIE] Add extractor for kommunetv.no ( #31516 )
...
* Add extractor for kommunetv.no
* Using utils.update_url instead of regex
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-13 03:54:51 +00:00
fonkap
822f19f05d
[FileMoonIE] Add extractor for filemoon.sx ( #31515 )
...
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-13 03:54:51 +00:00
Valentin Metz
f33923cba7
[rbgtum] Add new extractor ( #31305 )
...
* [rbgtum] Add new extractor
* Small update, force CI
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-13 03:54:51 +00:00
dirkf
e8198c517b
[YouTube] Fix tests
2023-02-13 03:54:51 +00:00
dirkf
bafb6dec72
[YouTube] Refresh compat/utils usage
...
* import parse_qs()
* import parse_qs in lazy_extractors (clears old TODO)
* clean up old compiled lazy_extractors for Py2
* use update_url()
2023-02-13 03:54:51 +00:00
dirkf
30e986b834
[YouTube] Add `signatureTimestamp` for age-gate bypass
2023-02-13 03:54:51 +00:00
dirkf
58988c1421
[YouTube] Bypass age-gating for certain restricted videos
...
* Use TVHTML5_SIMPLY_EMBEDDED_PLAYER client
* Also add and fix tests
* Introduce and use new utility function `update_url()`
2023-02-13 03:54:51 +00:00
dirkf
e19ec52322
[Vimeo] Support /user{video_id}/{slug} URL format
2023-02-12 22:16:00 +00:00
dirkf
f2f90887ca
[Vimeo] Fix Unable to extract info section redux
...
* as reported in yt-dlp/yt-dlp#6149
* also allow newline in target JSON object
2023-02-12 22:16:00 +00:00
dirkf
d947ffe8e3
[IGN] Overhaul extractor to avoid URL redirection loop
...
Consequently/also:
* centralise video data extraction
* detect 404 and 503 expected errors
* handle the test video in IGNVideo
* handle two additional page formats for the tests in IGNArticle
2023-02-12 22:16:00 +00:00
dirkf
384f632e8a
[ITV] Overhaul ITV extractor ( #30266 )
...
* support ITVX URLs (thanks Vangelis66)
* support legacy ITV Hub URLs
* include extraction fix 4c57dd2 from sleaux-meaux 3 May 2021
* include extraction fix 6fbcc16, fix by staubichsauger & pukkandan
* work-around duration parsing pending fix to utils.parse_duration
* apply default vanilla UA for pages and media to avoid site blocking
* also detect and report `Episode not found` instead of generic 404
* rework ITVBTCCIE with geo-block detection, best effort geo-restriction handling, news article support
* fix tests
2023-02-03 21:10:07 +00:00
dirkf
9d17948b5a
[myvideoge] Add new extractor ( #31360 )
...
NB download tests on CI servers blocked
Co-authored-by: Alfonso Solbes <fonk666@gmail.com>
2023-02-02 23:25:44 +00:00
afterdelight
f316f5d4e3
[xhamster] add support for new domain xhvid.com ( #31370 )
2023-02-02 23:20:14 +00:00
dirkf
bc6f94e459
[FIFA] Back-port extractor from yt-dlp ( #31385 )
2023-02-02 23:19:03 +00:00
Epsilonator
be3392a0d4
[Blerp] Add new extractor ( #31398 )
...
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-02 17:33:09 +00:00
zhangeric-15
6d829d8119
[YouTube] Fix not finding videos listed under a channel's "shorts" subpage. ( #31409 )
...
Resolves #31336
Co-authored-by: Jouni Järvinen <rautamiekka@users.noreply.github.com>
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-02 17:26:31 +00:00
Ruowang Sun
98b0cf1cd0
[Callin] Add new extractor ( #31414 )
...
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-02 17:21:05 +00:00
Leon Etienne
e9611a2a36
[pr0gramm] implement InfoExtractor, Resolves #31433 ( #31434 )
...
* [pr0gramm] implement infoextractor
* [pr0gramm] remove misplaced comment, uncapture regex-group
* [pr0gramm]: specify utf-8 coding
* [pr0gramm]: add trailing comma to lists for maintainability
* [pr0gramm]: ie only sets upload_date attribute
* [pr0gramm]: add video_id to title
* [pr0gramm]: more forgiving _valid_url regex
* [pr0gramm]: add uploader to title, if set
* Discriminate URL pattern
---------
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-02 17:13:39 +00:00
JChris246
807e593a32
[cammodels] fix and improve extractor ( #31453 )
...
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-02 17:12:36 +00:00
Brian Marks
37cbdfa0e7
[americastestkitchen] Add support for downloading entire series ( #31493 )
...
Also
* support new sites and URL patterns
* back-port from yt-dlp
Co-authored-by: dirkf <fieldhouse@gmx.net>
2023-02-02 16:58:21 +00:00
dirkf
195f22f679
[generic] Improve KVS (etc) extraction
2022-11-13 15:09:29 +00:00
dirkf
fc2beab0e7
[generic] Improve KVS (etc) extraction
...
* detect kt_player('kt_player', 'https://.../kt_player.swf?v=5 ...
* detect age limit if 18 USC 2257 is mentioned
* test with shooshtime.com
Partially resolves #31332 .
2022-11-13 14:59:30 +00:00
FraFraFra-LongD
1a4fbe8462
Added ThisVid.com support ( #29187 )
...
* add ThisVidIE, ThisVidMemberIE, ThisVidPlaylistIE
* redirect embed to main page for more metadata
* use KVS extraction newly added to GenericIE and remove duplicate tests
* also add MrDeepFake etc compat to GenericIE
(closes #22390 )
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-11-13 13:22:04 +00:00
dirkf
c2f9be3e63
[generic] Add KVS player extraction
2022-11-12 11:55:05 +00:00
dirkf
604762a9f8
[common:jwplayer] Improve jwplayer extraction and parsing ( #31000 )
...
* don't crash parser if jwplayer_data is invalid (empty, or no formats)
* use `label` in `sources[n]` as `format_id`
* relax `jwplayer().setup(...)` RE (also rework PR #27274 enhancement)
* detect more manifest formats in _parse_jwplayer_formats() (from PR #29596 )
* improve metadata extraction (from PR #25433 )
* remember URLs in a set
* use parse_resolution() in format
* extract filesize in format (from yt-dlp)
Co-authored-by: kikuyan <kikuyan@users.noreply.github.com>
Co-authored-by: martin54 <martin54@users.noreply.github.com>
2022-11-11 00:49:13 +00:00
Moises Lima
47e70fff8b
[PeekVids, PlayVids] Add new extractor ( #29765 )
...
* Merge back-port from yt-dlp
* Merge features from PR #29798
* Improve metadata extraction
Co-authored-by: dirkf <fieldhouse@gmx.net>
Co-authored-by: AXDOOMER
2022-11-09 20:26:30 +00:00
dirkf
de39d1281c
[extractor/ceskatelevize] Back-port extractor from yt-dlp, etc ( #30713 )
...
* back-port extractor, removing CeskaTelevizePoradyIE
* follow redirect URL
* support liveBroadcast and videobonusDetail in __NEXT__ data
* return single video for singleton playlist
* fix/add tests
2022-11-04 10:13:07 +00:00
Xie Yanbo
ce5d36486e
[netease] Support urls shared from mobile app ( #31304 )
...
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-10-30 11:48:44 +00:00
Xie Yanbo
d25cf62086
[netease] Improve error handling ( #31303 )
...
* add warnings for users outside of China
* skip empty song urls
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-10-30 11:46:46 +00:00
dirkf
502cefa41f
[Vimeo] Update variable name in hydration JSON pattern
...
Fixes #31311
2022-10-27 14:33:00 +00:00
dirkf
0faa45d6c0
[BongaCams] Support new .net domain
...
Resolves #31262 .
2022-10-20 11:06:44 +00:00
ache
447edc48e6
Fix ADN extractor ( #31275 )
...
* Rename Anime Digital Network to Animation Digital Network, animationdigitalnetwork.fr
* Update the test to an available video
* Update the decoding key of subtitles
* Keep the support of old URLs
* Add a test to match the old URL
* Reduce redundancy of the URL name
* Fix md5 ^^"
* Fix undefined _BASE
* Process HTTP error text (eg geo-block) correctly and uniformly in Py3, Py2
* Skip test for CI since geo-blocked
Signed-off-by: ache <ache@ache.one>
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-10-18 16:06:27 +01:00
dirkf
ee8560d01e
[ManyVids] Support new single-page app structure
2022-10-13 02:42:49 +00:00
dirkf
7135277fec
[ManyVids] Support new single-page app structure
...
See https://github.com/yt-dlp/yt-dlp/issues/5210#issuecomment-1276919962 .
2022-10-13 01:59:01 +00:00
dirkf
7bbd5b13d4
[Motherless] Pull from yt-dlp, etc
...
* use username field
* loosen regexes
* warn on page count 0 in group
* avoid reloading group page 1
Closes #29626
2022-10-12 01:09:55 +01:00
Xie Yanbo
c91cbf6072
[netease] Get netease music download url through player api ( #31235 )
...
* remove unplayable song from test
* compatible with python 2
* using standard User_Agent, fix imports
* use hash instead of long description
* fix lint
* fix hash
2022-10-11 13:55:09 +01:00
dirkf
11b284c81f
[Common:JWPlayer] Fix x1000 scaling error
...
See https://github.com/yt-dlp/yt-dlp/issues/5106#issuecomment-1264625161
2022-10-11 12:36:44 +00:00
dirkf
c282e5f8d7
[ZDF] Overhaul ZDF extractors
...
* pull some yt-dlp changes into ZDFBaseIE._extract_format()
* add test cases from yt-dlp to ZDFIE
* fix crash in ZDFIE._extract_mobile() when object had no `formitaeten`
* improve title extraction in ZDFChannelIE (remove trailing station ident)
* avoid extracting non-video playlist items (fixes #31149 )
2022-10-11 00:05:17 +01:00
Xiyue
82e4eca711
[motherless] Fixed the broken uploader_id in the extractor ( #31243 )
...
* Fixed the broken uploader_id in the extractor.
* Make uploader_id RE looser
* Fix uploader_id in test Motherless_3
* Fix group pagination
* # coding: utf-8
Co-authored-by: Andy Xuming <xuminic@gmail.com>
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-10-10 23:52:48 +01:00
dirkf
1b1442887e
[manyvids] Improve extraction ( #31172 )
...
* extract all formats from page
* extract description, uploader, views, likes
* downrate previews
* fix tests
* use txt_or_none()
2022-10-10 19:26:32 +01:00
dirkf
22127b271c
[NRK] Remove explicit Accept-Encoding header that invites Brotli
...
Fixes #31285
2022-10-10 17:41:40 +00:00
coletdjnz
d35557a75d
[Telegraaf] Use mobile GraphQL API endpoint
...
Workaround for Cloudflare 403
Fixes https://github.com/yt-dlp/yt-dlp/issues/5000
Authored by: coletdjnz
2022-10-04 11:43:08 +01:00
dirkf
573b13410e
[YouTube] Improve error check for n-sig processing
2022-08-25 12:14:59 +01:00
gudata
a8d5316aaf
[infoq] Avoid crash if the page has no mp3Form
...
* proposed fix for issue #31131 , aligns with yt-dlp
Co-authored-by: dirkf <fieldhouse@gmx.net>
2022-08-19 21:00:21 +01:00
dirkf
fd3f3bebd0
[uktvplay] Support domain without .uktv
2022-08-19 19:11:08 +01:00
dirkf
deee741fb1
[test, etc] Improve download test logs; also clean up some new flake8 issues ( #31153 )
...
* [test] Identify testcase errors better
* [test] Identify download errors better
* [extractor/minds] Linter
* [extractor/aes] Linter
2022-08-09 21:05:00 +01:00
Wes
adb5294177
[aenetworks] Update _THEPLATFORM_KEY and _THEPLATFORM_SECRET ( #29749 )
...
Fixes ytdl-org/youtube-dl#29300
2022-07-30 02:10:00 +01:00
Kyraminol Endyeran
5f5c127ece
[VVVVID] Support video/dash types ( #31060 )
...
Resolves #31030 .
2022-07-12 00:35:40 +01:00
dirkf
a03b9775d5
[Mediaset] Support player version number in URL pattern
...
Ref: https://github.com/yt-dlp/yt-dlp/issues/4141
2022-06-26 14:24:06 +01:00
dirkf
8a158a936c
[NHK] Use new API URL
2022-06-15 18:28:19 +01:00
dirkf
cc179df346
[XHamster] Support xhday.com alias, extract uploader_id
...
* support xhday.com alias for xhamster.com (resolves #31023 )
Authored by: dirkf
* extract `uploader_id`: from 908b56eaf7 (PR https://github.com/yt-dlp/yt-dlp/pull/844 )
Authored by: octotherp
2022-06-12 14:10:38 +01:00
pukkandan
0700fde640
[utils, etc] Kill child processes when yt-dl is killed
...
* derived from PR #26592 , closes #26592
Authored by: Unrud
2022-06-10 19:57:46 +01:00
dirkf
811c480f7b
[YouTube] Support JSON3 subtitle format
...
* subtitle tests updated to match
2022-06-09 15:25:23 +01:00
dirkf
530f4582d0
[HRFernsehen] Back-port new extractor from yt-dlp
...
Closes #26445 , where this was originally proposed.
2022-06-06 19:29:48 +01:00
dirkf
04fd3289d3
[YouPorn] Improve `upload_date` extraction
...
See https://github.com/yt-dlp/yt-dlp/issues/2701#issuecomment-1034341883
2022-05-28 13:54:32 +01:00
dirkf
187a48aee2
[YouTube] Handle player c5a4daa1 with indirect n-function definition
...
* resolves #30976
2022-05-24 15:43:56 +01:00
dirkf
c3deca86ae
[wat.tv] Add version `pver` to metadata API call
...
Resolves #30959 .
2022-05-19 17:41:48 +00:00
dirkf
c7965b9fc2
[NHK] Support alphabetic characters in 7-char NhkVod IDs ( #29682 )
2022-05-09 18:54:41 +01:00
dirkf
e27d8d819f
[streamcz] Remove empty `'{}'.format()` for Py2.6
...
Use `'-'.join()` here, or `{0}`, ..., in general.
2022-04-29 13:36:02 +01:00
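For context on the commit above: auto-numbered `{}` replacement fields are not accepted by `str.format()` on Python 2.6, so code that must run there either joins the parts or uses explicit indices. A small sketch of both options; the function names and sample values are hypothetical, not taken from the extractor.

```python
def make_format_id(kind, quality):
    # Joining with a separator avoids str.format() entirely
    return '-'.join((kind, quality))


def make_label(kind, quality):
    # Explicit field indices ({0}, {1}) work where bare {} does not
    return '{0} ({1})'.format(kind, quality)
```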
Árni Dagur
ebc627847c
[KTH] Add new extractor for KTH play ( #30885 )
...
* Implement extractor for KTH play
* Make KTH Play url regex more relaxed
2022-04-28 10:18:10 +01:00
dirkf
a0068bd6be
[Youtube] Fix "n" descrambling for player fae06c11
...
Resolves #30856 .
2022-04-15 16:07:09 +01:00
nixxo
871645a4a4
[RAI] Fix extraction of http formats
...
From https://github.com/yt-dlp/yt-dlp/pull/3272
Closes https://github.com/yt-dlp/yt-dlp/issues/3270
Authored by: nixxo
2022-04-05 15:21:59 +01:00
nixxo
1f50a07771
[RAI] Extend formats with direct http mp4 link (PR #27990 )
...
* initial support for creating direct mp4 link
* improved regexes and info extraction
* added "connection: close" to request headers
* updated to https://github.com/yt-dlp/yt-dlp/pull/208
2022-04-05 15:21:59 +01:00
nixxo
9e5ca66f16
[RAI] Added checks for DRM protected content (PR #27657 )
...
reviewed by pukkandan (https://github.com/yt-dlp/yt-dlp/pull/150 )
2022-04-05 15:21:59 +01:00
lihan7
17d295a1ec
[extractor/bilibili] Fix path "/audio/auxxxxx" download return 403
2022-04-01 00:46:34 +01:00
dirkf
4194d253c0
Avoid skipping ID when unlisted_hash is numeric
...
Pattern needed a non-greedy match; also replaced a redundant test with one for this, issue 29690
2022-02-26 10:29:42 +00:00
dirkf
f8e543c906
[Alsace20TV] Add new extractors Alsace20TVIE, Alsace20TVEmbedIE
2022-02-24 18:43:47 +00:00
dirkf
c4d1738316
[CPAC] Add extractor for Canadian Parliament
...
CPACIE: single episode
CPACPlaylistIE: playlists and searches
2022-02-24 18:27:57 +00:00
dirkf
1f13ccfd7f
Fixed groups() call on potentially empty regex search object ( #30676 )
...
* Fixed groups() call on potentially empty regex search object.
- https://github.com/ytdl-org/youtube-dl/issues/30521
* minimising lines changed
Co-authored-by: yayorbitgum <50963144+yayorbitgum@users.noreply.github.com>
2022-02-24 18:26:58 +00:00
marieell
923292ba64
[aliexpress] Fix test case
2022-02-24 13:44:52 +00:00
Lesmiscore (Naoya Ozaki)
782bfd26db
[bigo] add support for bigo.tv ( #30635 )
...
* [bigo] add support for bigo.tv
* [bigo] prepend "Bigo says"
* title fallback
* add error for invalid json data
2022-02-24 13:34:32 +00:00
Vladimir Stavrinov
3472227074
[rutv] fix vbr for empty string value ( #30623 )
...
* [rutv] use str_to_int() (thx dirkf)
2022-02-14 17:54:31 +00:00
Petr Vaněk
bf23bc0489
add missing __future__ import unicode_literals
2022-02-14 07:07:05 +00:00
Petr Vaněk
85bf26c1d0
resolve problem with unpacking operator for <py3.5
2022-02-14 07:07:05 +00:00
Petr Vaněk
d8adca1b66
[streamcz] test fixes and one additional test
2022-02-14 07:07:05 +00:00
Petr Vaněk
d02064218b
do not use f-strings
2022-02-14 07:07:05 +00:00