Sergey M․
17bcc626bf
[utils] Extract sanitize_url routine
2016-03-26 19:33:57 +06:00
Sergey M․
15707c7e02
[compat] Add compat_urllib_parse_urlencode and eliminate encode_dict
...
encode_dict functionality has been improved and moved directly into compat_urllib_parse_urlencode
All occurrences of compat_urllib_parse.urlencode throughout the codebase have been replaced by compat_urllib_parse_urlencode
Closes #8974
2016-03-26 01:46:57 +06:00
Yen Chi Hsuan
622d19160b
[utils] Clarify Python versions affected by buggy struct module
2016-03-24 18:06:15 +08:00
Yen Chi Hsuan
efbed08dc2
[utils] Encode hostnames before passing to urllib
...
With IDN (Internationalized Domain Name) and a proxy, non-ascii URLs
are passed down to urllib/urllib2, causing UnicodeEncodeError
Fixes #8890
2016-03-23 22:24:52 +08:00
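A minimal sketch of the idea behind this fix, assuming Python 3 and the standard `idna` codec. The helper name `encode_hostname_sketch` is hypothetical and is not the routine added by the commit; it also drops any userinfo in the URL and ignores IPv6 literals for brevity.

```python
from urllib.parse import urlsplit, urlunsplit


def encode_hostname_sketch(url):
    """Return `url` with its hostname converted to ASCII (IDNA/punycode).

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        return url
    # The stdlib 'idna' codec converts each label to its ASCII form.
    host = parts.hostname.encode('idna').decode('ascii')
    netloc = host if parts.port is None else '%s:%d' % (host, parts.port)
    return urlunsplit(parts._replace(netloc=netloc))


ascii_url = encode_hostname_sketch('http://bücher.example:8080/path?q=1')
```

After this step the host part contains only ASCII characters, so it can be handed to urllib without triggering an encoding error on the hostname.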
Jaime Marquínez Ferrándiz
782b1b5bd1
[utils] lookup_unit_table: Match word boundary instead of end of string
2016-03-19 11:44:49 +01:00
Jaime Marquínez Ferrándiz
09fc33198a
utils: lookup_unit_table: Use a stricter regex
...
In parse_count, multiple units start with the same letter, so the lookup could match different units depending on the order in which they were iterated over.
2016-03-18 19:23:06 +01:00
Sergey M․
810c10baa1
[utils] Use compat_xpath
2016-03-18 02:52:23 +06:00
Sergey M․
c5229f3926
[utils] PEP 8
2016-03-16 21:50:04 +06:00
remitamine
83548824c2
Merge pull request #8092 from bpfoley/twitter-thumbnail
...
[utils] Add extract_attributes for extracting html tag attributes
2016-03-16 13:16:27 +01:00
Sergey M․
2f7ae819ac
[utils] PEP 8
2016-03-13 17:23:08 +06:00
Sergey M․
fb47597b09
[bbc] Generalize unit table lookup and add parse_count
2016-03-13 16:27:20 +06:00
Yen Chi Hsuan
25cb05bda9
[utils] Remove codec2ext
...
This function was originally used for determining file extensions of DASH
formats. Now in DASH, ext is determined by mime_type. See #8766 for more
information.
2016-03-11 23:51:42 +08:00
Yen Chi Hsuan
6d210f2090
[utils] Add more codecs to codec2ext
...
BBC uses avc3. Here's an example (thanks to @remitamine for this example)
http://rdmedia.bbc.co.uk/dash/ondemand/bbb/2/client_manifest-common_init.mpd
See also https://trac.ffmpeg.org/ticket/5217
2016-03-06 17:57:48 +08:00
Yen Chi Hsuan
19a17d4623
[utils] Add codec2ext
2016-03-05 18:18:28 +08:00
Jaime Marquínez Ferrándiz
3233a68fbb
[utils] update_url_query: Encode the strings in the query dict
...
The test case with {'test': '第二行тест'} was failing on python 2 (the non-ascii characters were replaced with '?').
2016-03-04 22:18:40 +01:00
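A rough sketch of the behaviour the failing test expects, written for Python 3 only, where `urllib.parse.urlencode` already percent-encodes non-ASCII text as UTF-8. The name `update_url_query_sketch` is hypothetical; the commit's own code, which also has to cope with Python 2, is not reproduced here.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def update_url_query_sketch(url, query):
    """Merge `query` into the query string of `url` (illustration only)."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    params.update(query)
    # doseq=True expands the list values produced by parse_qs.
    return urlunparse(parsed._replace(query=urlencode(params, doseq=True)))


updated = update_url_query_sketch(
    'http://example.com/path?keep=1', {'test': '第二行тест'})
```

The resulting URL is plain ASCII, and parsing it back yields the original non-ASCII value rather than '?' placeholders.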
remitamine
1255733945
Merge pull request #8739 from remitamine/update_url_params
...
[utils] add update_url_query function to create or update query string params
2016-03-03 19:24:04 +01:00
remitamine
38f9ef31dc
[utils] add update_url_query function
2016-03-03 18:34:52 +01:00
Yen Chi Hsuan
0cae023b24
Merge branch 'jython-support'
...
Closes #8302
2016-03-03 18:49:32 +08:00
Yen Chi Hsuan
8ee239e921
[utils] Jython support - handle filenames correctly
...
Now test:youtube downloads
2016-03-03 18:47:54 +08:00
Brian Foley
8bb56eeeea
[utils] Add extract_attributes for extracting html tag attributes
...
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
2016-03-03 10:11:37 +00:00
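To illustrate why a parser-based approach handles the cases listed above, here is a minimal sketch built on the standard library's `html.parser.HTMLParser` (Python 3). The names `_AttrCollector` and `extract_attributes_sketch` are hypothetical and are not taken from the commit.

```python
from html.parser import HTMLParser


class _AttrCollector(HTMLParser):
    """Collect the attributes of the last start tag seen (illustration only)."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased,
        # values arrive unquoted and entity-decoded, and a repeated
        # attribute simply overwrites the earlier entry in the dict.
        self.attrs = dict(attrs)


def extract_attributes_sketch(html_element):
    parser = _AttrCollector()
    parser.feed(html_element)
    parser.close()
    return parser.attrs


attrs = extract_attributes_sketch('<e X=\'a\' y=b z="&amp;" empty="" bare>')
```

In this sketch an attribute without a value (`bare`) maps to `None`, while an explicitly empty one (`empty=""`) maps to an empty string.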
remitamine
e07237f640
[utils] remove check for val from find_xpath_attr
2016-03-02 21:40:21 +01:00
Yen Chi Hsuan
5eb6bdced4
[utils] Multiple changes to base_n()
...
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
2016-02-27 03:22:52 +08:00
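The behaviour listed in points 2 to 4 can be sketched as follows. This is an illustration under assumptions, not the commit's code: the function name `encode_base_n_sketch`, its signature, and the default table (digits, then lowercase, then uppercase letters) are all hypothetical.

```python
import string

# Assumed default table of 62 characters; the real default may differ.
DEFAULT_TABLE = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base_n_sketch(num, n, table=None):
    """Encode a non-negative integer in base `n` using `table` (sketch)."""
    if table is None:
        table = DEFAULT_TABLE[:n]
    if n > len(table):
        # Invalid input raises ValueError rather than tripping an assert.
        raise ValueError('base %d exceeds table length %d' % (n, len(table)))
    if num < 0:
        raise ValueError('negative numbers are not supported in this sketch')
    if num == 0:
        # Zero maps to the first table character, which need not be '0'.
        return table[0]
    encoded = ''
    while num:
        encoded = table[num % n] + encoded
        num //= n
    return encoded


hex_value = encode_base_n_sketch(255, 16)
zero_in_custom_table = encode_base_n_sketch(0, 3, 'abc')
```

Because the table is just a sequence of characters, a caller can pass one longer than 62 entries to use a larger base.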
Yen Chi Hsuan
680079be39
[utils] Relaxing regex in decode_packed_codes for vidzi
2016-02-26 15:13:03 +08:00
Yen Chi Hsuan
f52354a889
[utils] Move codes for handling eval() from iqiyi.py
2016-02-26 14:58:29 +08:00
Yen Chi Hsuan
59f898b7a7
[utils] Merge base_n functions
2016-02-26 14:37:20 +08:00
Yen Chi Hsuan
481888294d
[utils] Add base36 for use in Vidzi
2016-02-26 14:26:26 +08:00
Yen Chi Hsuan
81bdc8fdf6
[utils] Move base62 to utils
2016-02-26 14:26:26 +08:00
Sergey M․
f160785c5c
[utils] Remove AM/PM from unified_strdate patterns
2016-02-25 00:52:49 +06:00
Yen Chi Hsuan
b95dc034ca
[utils] Implement cache for OnDemandPagedList
2016-02-23 13:11:20 +08:00
remitamine
cafcf657a4
add more subtitles mime types to mimetype2ext and fix the platform subtitle extraction
2016-02-20 22:02:03 +01:00
Yen Chi Hsuan
c1c05c67ea
[utils] Jython support - disable setproctitle() until ctypes is complete
2016-02-21 03:32:03 +08:00
Yen Chi Hsuan
399a76e67b
[utils] Jython support: tolerate missing fcntl module
2016-02-21 03:32:03 +08:00
Jaime Marquínez Ferrándiz
765ac263db
[utils] mimetype2ext: return 'm4a' for 'audio/mp4' (fixes #8620)
...
The youtube extractor was using 'mp4' for them, therefore filters like 'bestaudio[ext=m4a]' stopped working (94278f7202 broke it).
2016-02-20 19:55:10 +01:00
Yen Chi Hsuan
5bc880b988
[utils] Add OHDave's RSA encryption function
2016-02-20 19:54:58 +08:00
Sergey M․
611c1dd96e
[refactor] Single quotes consistency
2016-02-14 15:37:17 +06:00
Sergey M․
d800609c62
[refactor] Do not specify redundant None as second argument in dict.get()
2016-02-14 14:25:04 +06:00
Sergey M․
9c7b38981c
[utils] Bump Firefox version in User-Agent
...
Old version number causes Youtube not to serve some formats in ytplayer.config
2016-02-11 23:12:30 +06:00
Sergey M․
8411229bd5
[utils] Allow dot in strip_jsonp
2016-02-07 19:47:09 +06:00
Sergey M․
86296ad2cd
[utils] Add ability to control skipping false values in dict_get
2016-02-07 08:13:04 +06:00
Sergey M․
cbecc9b903
[utils] Add dict_get convenience method
2016-02-07 06:12:53 +06:00
Jaime Marquínez Ferrándiz
87de7069b9
[utils] dfxp2srt: make TTMLPElementParser inherit from object
...
For consistency between python 2 and 3.
2016-02-02 22:30:13 +01:00
remitamine
2b14cb566f
[utils] fix dfxp2srt text extraction (fixes #8055)
2016-01-28 12:38:34 +01:00
Yen Chi Hsuan
a0d8d704df
[utils] Reorder items in mimetype2ext alphabetically
2016-01-25 01:01:15 +08:00
Yen Chi Hsuan
f6861ec96f
[utils] Add more items to mimetype2ext (#8293)
...
These are used in Youtube formats
2016-01-25 00:58:53 +08:00
remitamine
6ec6cb4e95
Revert "fix typos"
...
This reverts commit 36a0e46c39.
2016-01-10 19:27:22 +01:00
remitamine
36a0e46c39
fix typos
2016-01-10 17:55:41 +01:00
Jakub Wilk
dfb1b1468c
Fix typos
...
Closes #8200.
2016-01-10 17:24:28 +01:00
Sergey M․
a7aaa39863
[utils] Extract known extensions for reuse
2016-01-04 01:08:34 +06:00
Yen Chi Hsuan
c047270c02
[utils] Remove Content-encoding from headers after decompression
...
With cn_verification_proxy, our http_response() is called twice: once from
PerRequestProxyHandler.proxy_open() and once from the normal
YoutubeDL.urlopen(). As a result, for proxies honoring Accept-Encoding, the
following bug occurs:
$ youtube-dl -vs --cn-verification-proxy https://secure.uku.im:993 "test:letv"
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-vs', '--cn-verification-proxy', 'https://secure.uku.im:993', 'test:letv']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2015.12.23
[debug] Git HEAD: 97f18fa
[debug] Python version 3.5.1 - Linux-4.3.3-1-ARCH-x86_64-with-arch-Arch-Linux
[debug] exe versions: ffmpeg 2.8.4, ffprobe 2.8.4, rtmpdump 2.4
[debug] Proxy map: {}
[TestURL] Test URL: http://www.letv.com/ptv/vplay/22005890.html
[Letv] 22005890: Downloading webpage
[Letv] 22005890: Downloading playJson data
ERROR: Unable to download JSON metadata: Not a gzipped file (b'{"') (caused by OSError('Not a gzipped file (b\'{"\')',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/common.py", line 330, in _request_webpage
return self._downloader.urlopen(url_or_request)
File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 1886, in urlopen
return self._opener.open(req, timeout=self._socket_timeout)
File "/usr/lib/python3.5/urllib/request.py", line 471, in open
response = meth(req, response)
File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/utils.py", line 773, in http_response
raise original_ioerror
File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/utils.py", line 761, in http_response
uncompressed = io.BytesIO(gz.read())
File "/usr/lib/python3.5/gzip.py", line 274, in read
return self._buffer.read(size)
File "/usr/lib/python3.5/gzip.py", line 461, in read
if not self._read_gzip_header():
File "/usr/lib/python3.5/gzip.py", line 409, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
2015-12-28 01:09:18 +08:00
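The failure mode in the traceback above, and the effect of dropping the header once the body has been decoded, can be reproduced with a small stand-alone sketch. It assumes Python 3, a plain dict for headers, and bytes for the body; `decode_response_sketch` is a hypothetical helper and does not mirror the handler code in utils.py.

```python
import gzip
import io


def decode_response_sketch(headers, body):
    """Decompress a gzip body once and drop Content-Encoding (sketch)."""
    if headers.get('Content-Encoding') == 'gzip':
        body = gzip.GzipFile(fileobj=io.BytesIO(body), mode='rb').read()
        # Removing the header marks the body as already decoded, so a
        # second pass through this function leaves it untouched.
        headers = {k: v for k, v in headers.items()
                   if k != 'Content-Encoding'}
    return headers, body


raw = b'{"status": "ok"}'
headers, body = decode_response_sketch(
    {'Content-Encoding': 'gzip'}, gzip.compress(raw))
# Second pass, as when two handlers both process the same response.
headers, body = decode_response_sketch(headers, body)
```

If the header were kept after the first pass, the second pass would try to gunzip plain JSON and fail with "Not a gzipped file", as in the log above.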
Sergey M․
9b9c5355e4
Rename error_to_str to error_to_compat_str
2015-12-20 07:00:39 +06:00
Sergey M․
8e60dc7526
[utils] Add encode_compat_str
2015-12-20 06:26:26 +06:00
Sergey M․
fdae235858
[utils] Add error_to_str
2015-12-20 05:26:47 +06:00
Yen Chi Hsuan
db2fe38b55
[utils] Support alternative timestamp format in TTML
...
Fixes #7608
2015-12-19 19:29:51 +08:00
Yen Chi Hsuan
d631d5f9f2
[utils] Fix TTML conversion
...
Tolerate invalid timestamps (closes #7909)
2015-12-19 18:21:42 +08:00
Sergey M․
31b2051e21
[utils] Add remove_quotes
2015-12-14 21:30:58 +06:00
Yen Chi Hsuan
992fc9d6e1
[utils] Refactor handle_youtubedl_headers for future extension
2015-11-29 12:58:29 +08:00
Yen Chi Hsuan
0424ec307b
[utils] Correct docstring of YoutubeDLHandler
2015-11-29 12:46:04 +08:00
Yen Chi Hsuan
87f0e62d94
[utils] Separate codes for handling Youtubedl-* headers
2015-11-29 12:42:50 +08:00
Sergey M․
67dda51722
Rename compat_urllib_request_Request to sanitized_Request and move to utils
2015-11-23 21:55:15 +06:00
Sergey M․
9cb9a5df77
[utils] Check ext with trailing slash against the list of known extensions
2015-11-22 17:27:13 +06:00
Sergey M․
3e12bc583a
[utils] Improve determine_ext (Closes #7593)
2015-11-22 06:29:39 +06:00
Sergey M․
7e1f5447e7
[utils] Improve encode_dict
2015-11-21 20:46:33 +06:00
Sergey M․
7a3f0c00ad
[utils] Style
2015-11-16 20:24:09 +06:00
Sergey M․
7aefc49c40
[utils] Skip invalid/non HTML entities (Closes #7518)
2015-11-16 20:20:16 +06:00
Jaime Marquínez Ferrándiz
6a75040278
[utils] unified_strdate: Return None if the date format can't be recognized (fixes #7340)
...
This issue was introduced with ae12bc3ebb, which returned the string 'None'.
2015-11-02 14:08:38 +01:00
Sergey M․
c90d16cf36
[utils:sanitize_path] Disallow trailing whitespace in path segment (Closes #7332)
2015-11-02 04:26:20 +06:00
Sergey M
30eecc6a04
Merge pull request #7296 from jaimeMF/xml_attrib_unicode
...
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (…
2015-10-31 18:15:21 +00:00
Sergey M․
ae12bc3ebb
[utils] Make unified_strdate always return unicode string
2015-10-31 23:07:37 +06:00
Sergey M․
578c074575
[utils] Support list of xpath in xpath_element
2015-10-31 22:39:44 +06:00
Sergey M․
52c3a6e49d
[utils] Improve parse_iso8601
2015-10-28 21:40:22 +06:00
Jaime Marquínez Ferrándiz
f78546272c
[compat] compat_etree_fromstring: also decode the text attribute
...
Deletes parse_xml from utils, because it also does it.
2015-10-26 16:41:24 +01:00
Jaime Marquínez Ferrándiz
36e6f62cd0
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (#7178)
...
Attributes aren't unicode objects, so they couldn't be directly used in info_dict fields (for example '--write-description' doesn't work with bytes).
2015-10-25 20:13:16 +01:00
Sergey M․
d01949dc89
[utils:js_to_json] Fix bad escape in double quoted strings
2015-10-20 23:09:51 +06:00
Yen Chi Hsuan
1e399778ee
[letv] Fix extraction
...
Using data URIs for passing the decrypted M3U8 manifest, which is
supported by ffmpeg only.
2015-10-18 13:42:57 +08:00
Sergey M․
af98f8ff37
[utils] Return default on fail in int_or_none
2015-10-14 22:37:03 +06:00
Sergey M․
caf80631f0
[utils] Do not fail in float_or_none on non-numeric data
2015-10-14 22:36:37 +06:00
Sergey M․
1812afb7b3
[utils] Do not fail in int_or_none on non-numeric data (Closes #7175)
2015-10-14 22:35:01 +06:00
Sergey M․
5a1a2e9454
[utils] Fix kwargs on old python 2 (Closes #6905)
2015-09-20 21:08:29 +06:00
Sergey M․
e28034c5ac
[utils] Comment cookie processing until result from travis and some more testing
2015-09-06 08:16:39 +06:00
Sergey M․
266e466ee4
[utils] Simplify cookie processor
2015-09-06 07:53:11 +06:00
Sergey M․
1639282434
[utils] Add encode_dict
2015-09-06 07:22:20 +06:00
Sergey M․
ad72917274
[utils] Add issue URL in comment for #6457
2015-09-06 06:23:44 +06:00
Sergey M․
a6420bf50c
[utils] Add cookie processor for cookie correction (Closes #6769)
2015-09-06 06:20:48 +06:00
Sergey M․
66e289bab4
[utils] Generalize cli option converters
2015-09-05 03:05:11 +06:00
Sergey M․
8e636da499
[utils] Improve xpath_text
2015-09-05 00:34:49 +06:00
Sergey M․
5d2354f177
[utils] Relax attribute key assert
2015-09-04 23:57:27 +06:00
Sergey M․
a41fb80ce1
[utils] Add xpath_element and xpath_attr
2015-09-04 23:56:45 +06:00
Sergey M․
e5e78797e6
[utils] Strict HTTP responses (Closes #6727)
2015-09-02 02:16:04 +06:00
Sergey M․
5a4d9ddb21
[utils] Percent-encode redirect URL of Location header (Closes #6457)
2015-08-07 01:26:40 +06:00
Sergey M․
51f267d9d4
[YoutubeDL:utils] Move percent encode non-ASCII URLs workaround to http_request and simplify (Closes #6457)
2015-08-06 22:01:01 +06:00
Sergey M․
ee114368ad
[utils] Make value optional for find_xpath_attr
...
This allows selecting elements by attribute name without specifying a value, similar to the XPath syntax `[@attrib]`.
2015-08-01 20:22:13 +06:00
Raphael Michel
2c7ed24796
Remove redundant (and wrong) class parameters
2015-07-26 16:37:51 +02:00
Yen Chi Hsuan
9c29bc69f7
[utils] Improve parse_duration
...
Now dots are parsed. For example '87 Min.'
2015-07-22 23:15:22 +08:00
Sergey M․
bf42a9906d
[utils] Add default value for xpath_text
2015-06-28 22:56:07 +06:00
Yen Chi Hsuan
4eb10f6621
[utils] Add ISO3166Utils
2015-06-27 13:13:57 +08:00
Yen Chi Hsuan
4e33577173
[utils] Support ttaf1 namespace in TTML
...
It's found in bbc.co.uk. See #6038
2015-06-21 19:24:39 +08:00
Yen Chi Hsuan
396726244a
[utils/ffmpeg] Move ISO 639 related codes to utils
2015-06-21 18:53:17 +08:00
Yen Chi Hsuan
ecee572411
[yahoo] Add support for closed captions (closes #5714)
2015-05-19 00:50:24 +08:00
Yen Chi Hsuan
1b0427e6c4
[utils] Support TTML without default namespace
...
In a strict sense such TTML is invalid, but Yahoo uses it.
2015-05-19 00:45:01 +08:00
Yen Chi Hsuan
c1c924abfe
[utils,common] Merge format_srt_time and _subtitles_timecode
...
format_srt_time uses a comma as the delimiter between seconds and
milliseconds, while _subtitles_timecode uses a dot. All .srt examples I
found on the Internet use a comma, so I use a comma in the merged
version. See http://matroska.org/technical/specs/subtitles/srt.html and
http://devel.aegisub.org/wiki/SubtitleFormats/SRT
2015-05-12 13:04:54 +08:00
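For reference, the comma-delimited timecode discussed above looks like this in a minimal sketch. The function name `srt_timecode_sketch` is hypothetical and stands in for the merged helper, whose exact name and signature are not given in this log.

```python
def srt_timecode_sketch(seconds):
    """Format a duration in seconds as an SRT timecode, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, remainder = divmod(total_ms, 3600 * 1000)
    minutes, remainder = divmod(remainder, 60 * 1000)
    secs, millis = divmod(remainder, 1000)
    # SRT separates seconds and milliseconds with a comma, not a dot.
    return '%02d:%02d:%02d,%03d' % (hours, minutes, secs, millis)


timecode = srt_timecode_sketch(3661.5)
```

Working in integer milliseconds before splitting avoids floating-point drift in the individual fields.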