Commit Graph

975 Commits

Author SHA1 Message Date
Sergey M․
03d3af9768
[test_InfoExtractor] PEP 8 2020-12-13 23:47:13 +07:00
Sergey M․
1727541315
[extractor/common] Improve JSON-LD interaction statistic extraction (refs #23306) 2020-12-13 20:24:13 +07:00
Sergey M․
5a1fbbf8b7
[extractor/common] Fix inline HTML5 media tags processing and add test (closes #27345) 2020-12-09 00:05:21 +07:00
Sergey M․
191286265d
[youtube:tab] Fix feeds extraction (closes #25695, closes #26452) 2020-11-24 00:10:25 +07:00
Josh Soref
71ddc222ad
Fix typos (#27084)
* spelling: authorization

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: brightcove

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: creation

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: exceeded

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: exception

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: extension

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: extracting

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: extraction

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: frontline

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: improve

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: length

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: listsubtitles

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: multimedia

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: obfuscated

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: partitioning

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: playlist

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: playlists

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: restriction

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: services

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: split

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: srmediathek

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: support

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: thumbnail

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: verification

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>

* spelling: whitespaces

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-11-21 22:00:05 +07:00
Sergey M․
ab0eda99e1
[YoutubeDL] Fix --ignore-errors for playlists with generator-based entries of url_transparent (closes #27064) 2020-11-21 04:00:08 +07:00
Sergey M․
2864179293
[youtube] Improve extraction
+ Add support for --no-playlist (closes #27009)
* Improve playlist and mix extraction (closes #26390, closes #26509, closes #26534, closes #27011)
+ Extract playlist uploader data
* Update tests
2020-11-18 03:34:08 +07:00
Sergey M․
fe07e788bf
[utils] Skip ! prefixed code in js_to_json 2020-11-17 01:30:43 +07:00
Sergey M․
2de2ca6659
[youtube] Rework extractors
WIP
2020-11-12 06:16:37 +07:00
Kevin O'Connor
4eda10499e
[utils] Don't attempt to coerce JS strings to numbers in js_to_json (#26851)
The current logic in `js_to_json` tries to rewrite octal/hex numbers to
decimal. However, when the logic actually happens the `"` or `'` have
already been trimmed off. This causes what were originally strings, that
happen to look like octal/hex numbers, to get rewritten to decimal and
returned as a number rather than a string.

In practive something like:

```js
{
  "0x40": "foo",
  "040": "bar",
}
```

would get rewritten as:

```json
{
  64: "foo",
  32: "bar
}
```

This is problematic since this isn't valid JSON as you cannot have
non-string keys.
2020-10-18 00:10:41 +07:00
Sergey M․
1d9bf655e6
[utils] Recognize wav mimetype (closes #26463) 2020-09-06 11:19:53 +07:00
Sergey M․
84213ea8d4
[youtube] Extract chapters from JSON (closes #24819) 2020-06-06 04:22:10 +07:00
Sergey M․
c380cc28c4
[utils] Improve cookie files support
+ Add support for UTF-8 in cookie files
* Skip malformed cookie file entries instead of crashing (invalid entry len, invalid expires at)
2020-05-05 04:21:25 +07:00
Sergey M․
e40c758c2a
[youtube] Improve player id extraction and add tests 2020-05-02 07:18:08 +07:00
Sergey M․
042b664933
Revert "[utils] Add support for cookies with spaces used instead of tabs"
According to [1] TABs must be used as separators between fields.
Files produces by some tools with spaces as separators are considered
malformed.

1. https://curl.haxx.se/docs/http-cookies.html

This reverts commit cff99c91d1.
2020-03-10 04:53:51 +07:00
Sergey M․
cff99c91d1
[utils] Add support for cookies with spaces used instead of tabs 2020-03-08 18:01:32 +07:00
Sergey M․
ea17979d83
[test_subtitles] Remove obsolete test 2020-02-29 22:08:43 +07:00
Sergey M․
4e9e1e240d
[test_YoutubeDL] Add tests for #10591 (closes #23873) 2020-02-15 03:37:31 +07:00
Sergey M․
e0abaab293
[test_YoutubeDL] Fix get_ids 2020-02-15 03:37:25 +07:00
Sergey M․
42db58ec73
[utils] Improve str_to_int 2019-12-15 23:15:24 +07:00
Remita Amine
348c6bf1c1 [utils] handle int values passed to str_to_int 2019-11-29 17:39:18 +01:00
Sergey M․
1ced222120
[utils] Add generic caesar cipher and rot47 2019-11-27 02:26:42 +07:00
InfernalUnderling
9d30c2132a [utils] Handle rd-suffixed day parts in unified_strdate (#23199) 2019-11-27 00:08:37 +07:00
Remita Amine
237513e801 [yahoo] restore support for cbs suffixed URLs 2019-10-31 07:38:53 +01:00
Sergey M․
824fa51165
[utils] Improve subtitles_filename (closes #22753) 2019-10-18 04:03:53 +07:00
Sergey M․
28cc2241e4
[utils] Restrict parse_codecs and add theora as known vcodec (#21381) 2019-06-14 01:56:17 +07:00
Sergey M․
53cd37bac5
[utils] Improve strip_or_none 2019-05-24 00:03:01 +07:00
Sergey M․
3089bc748c
Fix W504 and disable W503 (closes #20863) 2019-05-11 03:57:40 +07:00
Jakub Wilk
fd35d8cdfd [utils] Transliterate "þ" as "th" (#20897)
Despite visual similarity "þ" is unrelated to "p".
It is normally transliterated as "th":

    $ echo þ-Þ | iconv -t ASCII//TRANSLIT
    th-TH
2019-05-11 01:42:31 +07:00
Sergey M․
5e1271c56d
[utils] Improve int_or_none and float_or_none (#20403) 2019-03-23 01:08:54 +07:00
Sergey M․
d493f15c11
[extractor/common] Improve HTML5 entries extraction and add some realworld tests 2019-03-17 09:09:32 +07:00
Sergey M․
0dc41787af
[utils] Introduce parse_bitrate 2019-03-17 09:07:47 +07:00
Sergey M․
2e27421c70
[test_InfoExtractor] Add test for #20346 2019-03-15 01:20:24 +07:00
Sergey M․
067aa17edf
Start moving to ytdl-org 2019-03-11 04:00:54 +07:00
Sergey M․
fca9baf0da
[test] Fix test_compat_etree_Element 2019-03-06 02:46:26 +07:00
Sergey M․
399f76870d
[compat] Introduce compat_etree_Element 2019-03-06 01:18:52 +07:00
remitamine
e7e62441cd [utils] strip #HttpOnly_ prefix from cookies files (#20219) 2019-03-03 19:23:59 +07:00
Ales Jirasek
22f5f5c6fc
[malltv] Add extractor (closes #18058) 2019-02-08 00:43:26 +07:00
Sergey M․
e118a8794f
[YoutubeDL] Fix typo in string negation implementation and add more tests (closes #18961) 2019-01-24 01:39:39 +07:00
Sergey M․
fad4ceb534
[utils] Fix urljoin for paths with non-http(s) schemes 2019-01-20 20:22:19 +07:00
Remita Amine
fc746c3fdd [test/test_InfoExtractor] add test for #18923 2019-01-20 09:05:12 +01:00
Sergey M․
2cc779f497
[YoutubeDL] Add negation support for string comparisons in format selection expressions (closes #18600, closes #18805) 2019-01-20 13:48:49 +07:00
Sergey M․
a16c7c033a
[test/helper] Add support for maxcount and count collection len test checkers 2019-01-16 02:17:49 +07:00
Sergey M․
6e29458f24
[test/testdata/cookies/session_cookies.txt] Fix empty expires test data 2018-12-10 04:30:00 +07:00
Sergey M․
9e02c2c704
[YoutubeDLCookieJar] Add test for keeping session cookies 2018-12-09 22:57:00 +07:00
Sergey M․
6864855eb1
[tests] Fix invalid escape sequences 2018-11-23 00:43:42 +07:00
Xiao Di Guan
95e42d7336 [extractor/common] Ensure response handle is not prematurely closed before it can be read if it matches expected_status (resolves #17195, closes #17846, resolves #17447) 2018-11-03 01:18:20 +07:00
Sergey M․
25d110be30
[utils] Properly recognize AV1 codec (closes #17506) 2018-09-10 02:37:22 +07:00
Sergey M․
af03000ad5
[utils] Introduce url_or_none 2018-07-21 18:03:58 +07:00
Sergey M․
e9c671d5e8
[utils] Allow JSONP with empty func name (closes #17028) 2018-07-21 12:30:18 +07:00