YoutubeDL uses a regexp in common.py::_find_jwplayer_data
to find the jwplayer options. However, the options are passed as the
argument of a JavaScript function call. For example, the regexp might
match this:
jwplayer('some_string').setup({
/** Other attributes */
sources: {
file: "<url of video>",
label: "<title of video>",
type: "mp4"
}
});
Since this is ordinary JavaScript, some websites write the
options as:
var src = {
file: "<url of video>",
label: "<title of video>",
type: "mp4"
}
jwplayer('some_string').setup({
/** Other attributes */
sources: src
});
In this case YoutubeDL won't be able to retrieve sources.file, since
the regexp only matches the ".setup(...)" and ignores the "var src
= ..." assignment.
This commit makes YoutubeDL raise an ExtractorError in the above
case. YoutubeDL will then try alternative methods to retrieve the URL
of the video.
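
The failure mode above can be reproduced with a minimal, standalone
sketch. It is not the code in common.py: SETUP_RE, the key-quoting step
and this ExtractorError class are simplified stand-ins assumed for
illustration only.

```python
import json
import re


class ExtractorError(Exception):
    """Simplified stand-in for youtube-dl's ExtractorError (assumption)."""


# Hypothetical, simplified pattern: capture the object literal passed to
# .setup(). Anything assigned outside the call (e.g. "var src = ...") is
# never seen by this pattern.
SETUP_RE = re.compile(
    r'jwplayer\([^)]*\)\.setup\(\s*(\{.*?\})\s*\)', re.DOTALL)


def find_jwplayer_options(webpage):
    match = SETUP_RE.search(webpage)
    if not match:
        return None
    # Quote bare keys so that a simple object literal becomes JSON.
    as_json = re.sub(
        r'([{,]\s*)([A-Za-z_]\w*)\s*:', r'\1"\2":', match.group(1))
    try:
        return json.loads(as_json)
    except ValueError:
        # "sources: src" refers to a variable, so the text is not JSON.
        raise ExtractorError(
            'jwplayer options are not a plain object literal')
```

With inline options the function returns a dict. With "sources: src" it
raises, which lets the caller move on to another extraction method.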
The Globe uses a "data-brightcove-video-id" attribute instead of
"data-video-id"; otherwise it is pretty much plain Brightcove. Not all
Globe videos are Brightcove videos, though, so fall back to Generic, too.
Also, abstract playlist_from_matches() from generic.py to common.py, and use
it here.
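
A rough, standalone sketch of what such a helper might look like follows.
The result dict layout, the de-duplication step, the example.com URL and
the 'Example' extractor key are assumptions for illustration, not the
code in common.py.

```python
import re


def url_result(url, ie=None):
    # Simplified stand-in: describe a URL that another extractor handles.
    result = {'_type': 'url', 'url': url}
    if ie is not None:
        result['ie_key'] = ie
    return result


def playlist_from_matches(matches, playlist_id=None, playlist_title=None,
                          getter=None, ie=None):
    # Turn each match into a URL, dropping duplicates but keeping order.
    urls = []
    for match in matches:
        url = getter(match) if getter else match
        if url not in urls:
            urls.append(url)
    return {
        '_type': 'playlist',
        'id': playlist_id,
        'title': playlist_title,
        'entries': [url_result(url, ie) for url in urls],
    }


webpage = ('<div data-brightcove-video-id="123"></div>'
           '<div data-brightcove-video-id="456"></div>')
video_ids = re.findall(r'data-brightcove-video-id="(\d+)"', webpage)
playlist = playlist_from_matches(
    video_ids, playlist_id='article-1',
    # Placeholder URL scheme, not a real Brightcove endpoint.
    getter=lambda video_id: 'https://video.example.com/' + video_id,
    ie='Example')
```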
History of these changes can be found in
51170427d4b1143572a498dedaee61863a5b2c5b.
* Rename options so that they share a prefix with --geo-verification-proxy
* Introduce _GEO_COUNTRIES for extractors
* Implement faking IP right away for sites with known geo restriction
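
One way to picture the IP faking is sketched below. Everything here is
an assumption made for illustration: the X-Forwarded-For header, the
COUNTRY_IP_BLOCKS table (filled with documentation-only address ranges
and made-up country codes) and both function names.

```python
import ipaddress
import random

# Placeholder table: made-up country codes mapped to reserved
# documentation ranges. Real per-country blocks would go here.
COUNTRY_IP_BLOCKS = {
    'XX': '192.0.2.0/24',
    'YY': '198.51.100.0/24',
}


def random_ipv4(country_code, rng=random):
    # Pick a random address inside the block known for the country.
    block = COUNTRY_IP_BLOCKS.get(country_code.upper())
    if block is None:
        return None
    network = ipaddress.ip_network(block)
    offset = rng.randint(0, network.num_addresses - 1)
    return str(network.network_address + offset)


def fake_ip_headers(geo_countries, rng=random):
    # geo_countries plays the role of an extractor's _GEO_COUNTRIES list.
    if not geo_countries:
        return {}
    ip = random_ipv4(rng.choice(geo_countries), rng)
    return {'X-Forwarded-For': ip} if ip else {}
```

An extractor that declares its countries up front can have such a header
attached to its first request, instead of only after a geo error.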
* [infoq] Add audio only format if available
Refactor cookie code into a function.
Renamed formats to http_video, http_audio, rtmp_video
Renamed extract functions to video instead of videos as they return
one or no video.
* [infoq] Rename to _extract_cookies as it extracts more than one
* [infoq] Remove redundant determine_ext
* [infoq] Add comment about hardcoded URL
* [infoq] Use _hidden_inputs instead of messy regex
* [infoq] Probe if audio URL is valid
Make it possible to pass headers to _is_valid_url
* [infoq] Add audio only test
Ref: #10625
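
The audio URL probe with extra headers could look roughly like this
standalone sketch. It is not youtube-dl's _is_valid_url: the function
below and its opener parameter (there only so the probe can be exercised
without network access) are assumptions.

```python
import urllib.error
import urllib.request


def is_valid_url(url, headers=None, opener=urllib.request.urlopen):
    """Return True if the URL can be opened, False on an HTTP/URL error."""
    # Extra headers (e.g. cookies) are sent along with the probe request.
    request = urllib.request.Request(url, headers=headers or {})
    try:
        response = opener(request)
    except urllib.error.URLError:
        # HTTPError is a subclass of URLError, so 4xx/5xx land here too.
        return False
    response.close()
    return True
```

A format whose URL fails the probe can then simply be left out of the
list of formats.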
In a strict sense, <track>s with kind=captions are not subtitles. [1]
openload misuses this attribute, and I guess there will be more
examples, so I add it to common.py.
Also allow extracting information for subtitles-only <video> or <audio>
tags, which is the case of openload.
[1] https://www.w3.org/TR/html5/embedded-content-0.html#attr-track-kind
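
A small standalone sketch of treating both kinds of <track> as subtitles
follows. It uses Python 3's html.parser rather than the code in
common.py; the TrackCollector name, the 'und' fallback language and the
shape of the result dict are assumptions.

```python
from html.parser import HTMLParser


class TrackCollector(HTMLParser):
    """Collect <track> elements that can be offered as subtitles."""

    def __init__(self):
        super().__init__()
        self.subtitles = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'track':
            return
        attrs = dict(attrs)
        # A missing kind defaults to "subtitles" in HTML5; "captions" is
        # accepted as well because sites use it for the same purpose.
        kind = attrs.get('kind') or 'subtitles'
        src = attrs.get('src')
        if kind not in ('subtitles', 'captions') or not src:
            return
        lang = attrs.get('srclang') or attrs.get('label') or 'und'
        self.subtitles.setdefault(lang, []).append({'url': src})


collector = TrackCollector()
collector.feed(
    '<video>'
    '<track kind="captions" src="en.vtt" srclang="en">'
    '<track kind="metadata" src="meta.vtt">'
    '</video>')
subtitles = collector.subtitles
```

Note that the <video> tag in this input has no src of its own, which is
the subtitles-only case mentioned above.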
- Eliminate segment_urls and initialization_url
+ Introduce manifest_url (a manifest may contain unfragmented data; in this case url is used for the direct media URL and manifest_url for the manifest itself)
* Rewrite dashsegments downloader to use fragments data
* Improve generic mpd extraction
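
The relationship between the old and new fields might be sketched as
follows. The dict layout is an assumption: in particular the 'fragments'
key and the per-fragment 'url' key are illustrative, as are both helper
names.

```python
def to_fragments(initialization_url, segment_urls):
    # Old style: one initialization URL plus a list of segment URLs.
    # New style (assumed layout): a single ordered list of fragments.
    fragments = []
    if initialization_url:
        fragments.append({'url': initialization_url})
    fragments.extend({'url': url} for url in segment_urls)
    return fragments


def urls_to_download(fmt):
    # Fragmented format: fetch every fragment in order.
    fragments = fmt.get('fragments')
    if fragments:
        return [fragment['url'] for fragment in fragments]
    # Unfragmented format: url points directly at the media, while
    # manifest_url only records where the manifest itself lives.
    return [fmt['url']]


fragmented = {
    'manifest_url': 'http://example.com/manifest.mpd',
    'fragments': to_fragments(
        'http://example.com/init.mp4',
        ['http://example.com/seg1.m4s', 'http://example.com/seg2.m4s']),
}
unfragmented = {
    'manifest_url': 'http://example.com/manifest.mpd',
    'url': 'http://example.com/video.mp4',
}
```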