Merge branch 'master' into peertube_extractor

Parth Verma 2018-05-17 12:50:37 +05:30
commit fdb1961a17
27 changed files with 714 additions and 77 deletions

@ -6,8 +6,8 @@
--- ---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.04.25*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected. ### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.05.09*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.04.25** - [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.05.09**
### Before submitting an *issue* make sure you have: ### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@ -36,7 +36,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
[debug] User config: [] [debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2018.04.25 [debug] youtube-dl version 2018.05.09
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {} [debug] Proxy map: {}

@ -1,3 +1,47 @@
version 2018.05.09
Core
* [YoutubeDL] Ensure ext exists for automatic captions
* Introduce --geo-bypass-ip-block
Extractors
+ [udemy] Extract asset captions
+ [udemy] Extract stream URLs (#16372)
+ [businessinsider] Add support for businessinsider.com (#16387, #16388, #16389)
+ [cloudflarestream] Add support for cloudflarestream.com (#16375)
* [watchbox] Fix extraction (#16356)
* [discovery] Extract Affiliate/Anonymous Auth Token from cookies (#14954)
+ [itv:btcc] Add support for itv.com/btcc (#16139)
* [tunein] Use live title for live streams (#16347)
* [itv] Improve extraction (#16253)
version 2018.05.01
Core
* [downloader/fragment] Restart download if .ytdl file is corrupt (#16312)
+ [extractor/common] Extract interaction statistic
+ [utils] Add merge_dicts
+ [extractor/common] Add _download_json_handle
Extractors
* [kaltura] Improve iframe embeds detection (#16337)
+ [udemy] Extract outputs renditions (#16289, #16291, #16320, #16321, #16334,
#16335)
+ [zattoo] Add support for zattoo.com and mobiltv.quickline.com (#14668, #14676)
* [yandexmusic] Convert release_year to int
* [udemy] Override _download_webpage_handle instead of _download_webpage
* [xiami] Override _download_webpage_handle instead of _download_webpage
* [yandexmusic] Override _download_webpage_handle instead of _download_webpage
* [youtube] Correctly disable polymer on all requests (#16323, #16326)
* [generic] Prefer enclosures over links in RSS feeds (#16189)
+ [redditr] Add support for old.reddit.com URLs (#16274)
* [nrktv] Update API host (#16324)
+ [imdb] Extract all formats (#16249)
+ [vimeo] Extract JSON-LD (#16295)
* [funk:channel] Improve extraction (#16285)
version 2018.04.25 version 2018.04.25
Core Core

@ -116,6 +116,9 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
--geo-bypass-country CODE Force bypass geographic restriction with --geo-bypass-country CODE Force bypass geographic restriction with
explicitly provided two-letter ISO 3166-2 explicitly provided two-letter ISO 3166-2
country code (experimental) country code (experimental)
--geo-bypass-ip-block IP_BLOCK Force bypass geographic restriction with
explicitly provided IP block in CIDR
notation (experimental)
## Video Selection: ## Video Selection:
--playlist-start NUMBER Playlist video to start at (default is 1) --playlist-start NUMBER Playlist video to start at (default is 1)
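
Note on `--geo-bypass-ip-block`: the option takes a block in CIDR notation, and the commit hands that block to `GeoUtils.random_ipv4` to pick the faked address. The following is a minimal sketch of what "pick a random address inside a CIDR block" can look like. It uses the standard `ipaddress` module (Python 3) and is a stand-in for illustration, not the implementation in youtube-dl's `utils`. The name `random_ipv4_from_block` is hypothetical.

```python
import ipaddress
import random


def random_ipv4_from_block(cidr_block, rng=random):
    """Return a random IPv4 address (as a string) inside cidr_block.

    Hypothetical helper for illustration only; youtube-dl's own
    GeoUtils.random_ipv4 may be implemented differently.
    """
    # strict=False tolerates host bits being set, e.g. '10.0.0.1/24'
    network = ipaddress.ip_network(cidr_block, strict=False)
    # Index into the block: 0 is the first address, num_addresses - 1 the last
    offset = rng.randrange(network.num_addresses)
    return str(network[offset])


if __name__ == '__main__':
    # One of the blocks listed for ITV BTCC elsewhere in this commit
    print(random_ipv4_from_block('193.113.0.0/16'))
```

On the command line the block would be passed as `youtube-dl --geo-bypass-ip-block 193.113.0.0/16 URL`, with `URL` standing for whatever is being downloaded.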

@ -122,6 +122,7 @@
- **BRMediathek**: Bayerischer Rundfunk Mediathek - **BRMediathek**: Bayerischer Rundfunk Mediathek
- **bt:article**: Bergens Tidende Articles - **bt:article**: Bergens Tidende Articles
- **bt:vestlendingen**: Bergens Tidende - Vestlendingen - **bt:vestlendingen**: Bergens Tidende - Vestlendingen
- **BusinessInsider**
- **BuzzFeed** - **BuzzFeed**
- **BYUtv** - **BYUtv**
- **Camdemy** - **Camdemy**
@ -163,6 +164,7 @@
- **ClipRs** - **ClipRs**
- **Clipsyndicate** - **Clipsyndicate**
- **CloserToTruth** - **CloserToTruth**
- **CloudflareStream**
- **cloudtime**: CloudTime - **cloudtime**: CloudTime
- **Cloudy** - **Cloudy**
- **Clubic** - **Clubic**
@ -373,6 +375,7 @@
- **Ir90Tv** - **Ir90Tv**
- **ITTF** - **ITTF**
- **ITV** - **ITV**
- **ITVBTCC**
- **ivi**: ivi.ru - **ivi**: ivi.ru
- **ivi:compilation**: ivi.ru compilations - **ivi:compilation**: ivi.ru compilations
- **ivideon**: Ivideon TV - **ivideon**: Ivideon TV
@ -667,6 +670,8 @@
- **qqmusic:playlist**: QQ音乐 - 歌单 - **qqmusic:playlist**: QQ音乐 - 歌单
- **qqmusic:singer**: QQ音乐 - 歌手 - **qqmusic:singer**: QQ音乐 - 歌手
- **qqmusic:toplist**: QQ音乐 - 排行榜 - **qqmusic:toplist**: QQ音乐 - 排行榜
- **Quickline**
- **QuicklineLive**
- **R7** - **R7**
- **R7Article** - **R7Article**
- **radio.de** - **radio.de**
@ -1092,6 +1097,8 @@
- **youtube:watchlater**: Youtube watch later list, ":ytwatchlater" for short (requires authentication) - **youtube:watchlater**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
- **Zapiks** - **Zapiks**
- **Zaq1** - **Zaq1**
- **Zattoo**
- **ZattooLive**
- **ZDF** - **ZDF**
- **ZDFChannel** - **ZDFChannel**
- **zingmp3**: mp3.zing.vn - **zingmp3**: mp3.zing.vn

@ -286,6 +286,9 @@ class YoutubeDL(object):
Two-letter ISO 3166-2 country code that will be used for Two-letter ISO 3166-2 country code that will be used for
explicit geographic restriction bypassing via faking explicit geographic restriction bypassing via faking
X-Forwarded-For HTTP header (experimental) X-Forwarded-For HTTP header (experimental)
geo_bypass_ip_block:
IP range in CIDR notation that will be used similarly to
geo_bypass_country (experimental)
The following options determine which downloader is picked: The following options determine which downloader is picked:
external_downloader: Executable of the external downloader to call. external_downloader: Executable of the external downloader to call.
@ -1479,23 +1482,28 @@ class YoutubeDL(object):
if info_dict.get('%s_number' % field) is not None and not info_dict.get(field): if info_dict.get('%s_number' % field) is not None and not info_dict.get(field):
info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field]) info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field])
for cc_kind in ('subtitles', 'automatic_captions'):
cc = info_dict.get(cc_kind)
if cc:
for _, subtitle in cc.items():
for subtitle_format in subtitle:
if subtitle_format.get('url'):
subtitle_format['url'] = sanitize_url(subtitle_format['url'])
if subtitle_format.get('ext') is None:
subtitle_format['ext'] = determine_ext(subtitle_format['url']).lower()
automatic_captions = info_dict.get('automatic_captions')
subtitles = info_dict.get('subtitles') subtitles = info_dict.get('subtitles')
if subtitles:
for _, subtitle in subtitles.items():
for subtitle_format in subtitle:
if subtitle_format.get('url'):
subtitle_format['url'] = sanitize_url(subtitle_format['url'])
if subtitle_format.get('ext') is None:
subtitle_format['ext'] = determine_ext(subtitle_format['url']).lower()
if self.params.get('listsubtitles', False): if self.params.get('listsubtitles', False):
if 'automatic_captions' in info_dict: if 'automatic_captions' in info_dict:
self.list_subtitles(info_dict['id'], info_dict.get('automatic_captions'), 'automatic captions') self.list_subtitles(
info_dict['id'], automatic_captions, 'automatic captions')
self.list_subtitles(info_dict['id'], subtitles, 'subtitles') self.list_subtitles(info_dict['id'], subtitles, 'subtitles')
return return
info_dict['requested_subtitles'] = self.process_subtitles( info_dict['requested_subtitles'] = self.process_subtitles(
info_dict['id'], subtitles, info_dict['id'], subtitles, automatic_captions)
info_dict.get('automatic_captions'))
# We now pick which formats have to be downloaded # We now pick which formats have to be downloaded
if info_dict.get('formats') is None: if info_dict.get('formats') is None:
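
Note on the caption change above: the loop that used to run only over `subtitles` now runs over both `subtitles` and `automatic_captions`, so every track ends up with an `ext`. The sketch below shows the same idea in isolation. `guess_ext` is a simplified, hypothetical stand-in for youtube-dl's `determine_ext`, and the `'unknown'` fallback is an assumption. Unlike the diff, this sketch skips tracks that have no `url` instead of indexing into them. It relies on `urllib.parse`, so it is Python 3 only.

```python
import posixpath
from urllib.parse import urlparse


def guess_ext(url, default='unknown'):
    """Guess a lowercase file extension from the URL path (simplified)."""
    ext = posixpath.splitext(urlparse(url).path)[1].lstrip('.')
    return ext.lower() if ext else default


def fill_caption_exts(info_dict):
    """Ensure every subtitle and automatic caption track has an 'ext'."""
    for cc_kind in ('subtitles', 'automatic_captions'):
        # Each kind maps a language code to a list of track dicts
        for tracks in (info_dict.get(cc_kind) or {}).values():
            for track in tracks:
                if track.get('ext') is None and track.get('url'):
                    track['ext'] = guess_ext(track['url'])
    return info_dict


if __name__ == '__main__':
    info = {
        'subtitles': {'en': [{'url': 'https://example.com/subs/en.VTT?x=1'}]},
        'automatic_captions': {'de': [{'url': 'https://example.com/a/de.srv1'}]},
    }
    print(fill_caption_exts(info))
```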

@ -430,6 +430,7 @@ def _real_main(argv=None):
'config_location': opts.config_location, 'config_location': opts.config_location,
'geo_bypass': opts.geo_bypass, 'geo_bypass': opts.geo_bypass,
'geo_bypass_country': opts.geo_bypass_country, 'geo_bypass_country': opts.geo_bypass_country,
'geo_bypass_ip_block': opts.geo_bypass_ip_block,
# just for deprecation check # just for deprecation check
'autonumber': opts.autonumber if opts.autonumber is True else None, 'autonumber': opts.autonumber if opts.autonumber is True else None,
'usetitle': opts.usetitle if opts.usetitle is True else None, 'usetitle': opts.usetitle if opts.usetitle is True else None,

@ -277,7 +277,9 @@ class AnvatoIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
self._initialize_geo_bypass(smuggled_data.get('geo_countries')) self._initialize_geo_bypass({
'countries': smuggled_data.get('geo_countries'),
})
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
access_key, video_id = mobj.group('access_key_or_mcp', 'id') access_key, video_id = mobj.group('access_key_or_mcp', 'id')

@ -669,7 +669,10 @@ class BrightcoveNewIE(AdobePassIE):
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
self._initialize_geo_bypass(smuggled_data.get('geo_countries')) self._initialize_geo_bypass({
'countries': smuggled_data.get('geo_countries'),
'ip_blocks': smuggled_data.get('geo_ip_blocks'),
})
account_id, player_id, embed, video_id = re.match(self._VALID_URL, url).groups() account_id, player_id, embed, video_id = re.match(self._VALID_URL, url).groups()

@ -0,0 +1,42 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from .jwplatform import JWPlatformIE
class BusinessInsiderIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?businessinsider\.(?:com|nl)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://uk.businessinsider.com/how-much-radiation-youre-exposed-to-in-everyday-life-2016-6',
'md5': 'ca237a53a8eb20b6dc5bd60564d4ab3e',
'info_dict': {
'id': 'hZRllCfw',
'ext': 'mp4',
'title': "Here's how much radiation you're exposed to in everyday life",
'description': 'md5:9a0d6e2c279948aadaa5e84d6d9b99bd',
'upload_date': '20170709',
'timestamp': 1499606400,
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.businessinsider.nl/5-scientifically-proven-things-make-you-less-attractive-2017-7/',
'only_matching': True,
}, {
'url': 'http://www.businessinsider.com/excel-index-match-vlookup-video-how-to-2015-2?IR=T',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
jwplatform_id = self._search_regex(
(r'data-media-id=["\']([a-zA-Z0-9]{8})',
r'id=["\']jwplayer_([a-zA-Z0-9]{8})',
r'id["\']?\s*:\s*["\']?([a-zA-Z0-9]{8})'),
webpage, 'jwplatform id')
return self.url_result(
'jwplatform:%s' % jwplatform_id, ie=JWPlatformIE.ie_key(),
video_id=video_id)

@ -0,0 +1,60 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
class CloudflareStreamIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
(?:watch\.)?cloudflarestream\.com/|
embed\.cloudflarestream\.com/embed/[^/]+\.js\?.*?\bvideo=
)
(?P<id>[\da-f]+)
'''
_TESTS = [{
'url': 'https://embed.cloudflarestream.com/embed/we4g.fla9.latest.js?video=31c9291ab41fac05471db4e73aa11717',
'info_dict': {
'id': '31c9291ab41fac05471db4e73aa11717',
'ext': 'mp4',
'title': '31c9291ab41fac05471db4e73aa11717',
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://watch.cloudflarestream.com/9df17203414fd1db3e3ed74abbe936c1',
'only_matching': True,
}, {
'url': 'https://cloudflarestream.com/31c9291ab41fac05471db4e73aa11717/manifest/video.mpd',
'only_matching': True,
}]
@staticmethod
def _extract_urls(webpage):
return [
mobj.group('url')
for mobj in re.finditer(
r'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//embed\.cloudflarestream\.com/embed/[^/]+\.js\?.*?\bvideo=[\da-f]+?.*?)\1',
webpage)]
def _real_extract(self, url):
video_id = self._match_id(url)
formats = self._extract_m3u8_formats(
'https://cloudflarestream.com/%s/manifest/video.m3u8' % video_id,
video_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls',
fatal=False)
formats.extend(self._extract_mpd_formats(
'https://cloudflarestream.com/%s/manifest/video.mpd' % video_id,
video_id, mpd_id='dash', fatal=False))
self._sort_formats(formats)
return {
'id': video_id,
'title': video_id,
'formats': formats,
}
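
Note on `_VALID_URL` for the new extractor: the pattern accepts watch pages, bare manifest URLs and the embed script URL, and captures the hexadecimal video id in each case. A quick standalone check of that pattern against the three test URLs from `_TESTS` could look like this. The helper name `match_video_id` is hypothetical; inside the extractor the same job is done by `self._match_id(url)`.

```python
import re

# Pattern copied from CloudflareStreamIE._VALID_URL in this commit
VALID_URL = r'''(?x)
    https?://
    (?:
        (?:watch\.)?cloudflarestream\.com/|
        embed\.cloudflarestream\.com/embed/[^/]+\.js\?.*?\bvideo=
    )
    (?P<id>[\da-f]+)
'''


def match_video_id(url):
    """Return the captured video id, or None if the URL does not match."""
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None


if __name__ == '__main__':
    for url in (
        'https://embed.cloudflarestream.com/embed/we4g.fla9.latest.js?video=31c9291ab41fac05471db4e73aa11717',
        'https://watch.cloudflarestream.com/9df17203414fd1db3e3ed74abbe936c1',
        'https://cloudflarestream.com/31c9291ab41fac05471db4e73aa11717/manifest/video.mpd',
    ):
        print(match_video_id(url))
```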

@ -346,6 +346,11 @@ class InfoExtractor(object):
geo restriction bypass mechanism right away in order to bypass geo restriction bypass mechanism right away in order to bypass
geo restriction, of course, if the mechanism is not disabled. (experimental) geo restriction, of course, if the mechanism is not disabled. (experimental)
_GEO_IP_BLOCKS attribute may contain a list of presumably geo unrestricted
IP blocks in CIDR notation for this extractor. One of these IP blocks
will be used by geo restriction bypass mechanism similarly
to _GEO_COUNTRIES. (experimental)
NB: both these geo attributes are experimental and may change in future NB: both these geo attributes are experimental and may change in future
or be completely removed. or be completely removed.
@ -358,6 +363,7 @@ class InfoExtractor(object):
_x_forwarded_for_ip = None _x_forwarded_for_ip = None
_GEO_BYPASS = True _GEO_BYPASS = True
_GEO_COUNTRIES = None _GEO_COUNTRIES = None
_GEO_IP_BLOCKS = None
_WORKING = True _WORKING = True
def __init__(self, downloader=None): def __init__(self, downloader=None):
@ -392,12 +398,15 @@ class InfoExtractor(object):
def initialize(self): def initialize(self):
"""Initializes an instance (authentication, etc).""" """Initializes an instance (authentication, etc)."""
self._initialize_geo_bypass(self._GEO_COUNTRIES) self._initialize_geo_bypass({
'countries': self._GEO_COUNTRIES,
'ip_blocks': self._GEO_IP_BLOCKS,
})
if not self._ready: if not self._ready:
self._real_initialize() self._real_initialize()
self._ready = True self._ready = True
def _initialize_geo_bypass(self, countries): def _initialize_geo_bypass(self, geo_bypass_context):
""" """
Initialize geo restriction bypass mechanism. Initialize geo restriction bypass mechanism.
@ -408,28 +417,82 @@ class InfoExtractor(object):
HTTP requests. HTTP requests.
This method will be used for initial geo bypass mechanism initialization This method will be used for initial geo bypass mechanism initialization
during the instance initialization with _GEO_COUNTRIES. during the instance initialization with _GEO_COUNTRIES and
_GEO_IP_BLOCKS.
You may also manually call it from extractor's code if geo countries You may also manually call it from extractor's code if geo bypass
information is not available beforehand (e.g. obtained during information is not available beforehand (e.g. obtained during
extraction) or due to some another reason. extraction) or due to some other reason. In this case you should pass
this information in geo bypass context passed as first argument. It may
contain following fields:
countries: List of geo unrestricted countries (similar
to _GEO_COUNTRIES)
ip_blocks: List of geo unrestricted IP blocks in CIDR notation
(similar to _GEO_IP_BLOCKS)
""" """
if not self._x_forwarded_for_ip: if not self._x_forwarded_for_ip:
country_code = self._downloader.params.get('geo_bypass_country', None)
# If there is no explicit country for geo bypass specified and # Geo bypass mechanism is explicitly disabled by user
# the extractor is known to be geo restricted let's fake IP if not self._downloader.params.get('geo_bypass', True):
# as X-Forwarded-For right away. return
if (not country_code and
self._GEO_BYPASS and if not geo_bypass_context:
self._downloader.params.get('geo_bypass', True) and geo_bypass_context = {}
countries):
country_code = random.choice(countries) # Backward compatibility: previously _initialize_geo_bypass
if country_code: # expected a list of countries, some 3rd party code may still use
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code) # it this way
if isinstance(geo_bypass_context, (list, tuple)):
geo_bypass_context = {
'countries': geo_bypass_context,
}
# The whole point of geo bypass mechanism is to fake IP
# as X-Forwarded-For HTTP header based on some IP block or
# country code.
# Path 1: bypassing based on IP block in CIDR notation
# Explicit IP block specified by user, use it right away
# regardless of whether extractor is geo bypassable or not
ip_block = self._downloader.params.get('geo_bypass_ip_block', None)
# Otherwise use random IP block from geo bypass context but only
# if extractor is known as geo bypassable
if not ip_block:
ip_blocks = geo_bypass_context.get('ip_blocks')
if self._GEO_BYPASS and ip_blocks:
ip_block = random.choice(ip_blocks)
if ip_block:
self._x_forwarded_for_ip = GeoUtils.random_ipv4(ip_block)
if self._downloader.params.get('verbose', False):
self._downloader.to_screen(
'[debug] Using fake IP %s as X-Forwarded-For.'
% self._x_forwarded_for_ip)
return
# Path 2: bypassing based on country code
# Explicit country code specified by user, use it right away
# regardless of whether extractor is geo bypassable or not
country = self._downloader.params.get('geo_bypass_country', None)
# Otherwise use random country code from geo bypass context but
# only if extractor is known as geo bypassable
if not country:
countries = geo_bypass_context.get('countries')
if self._GEO_BYPASS and countries:
country = random.choice(countries)
if country:
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country)
if self._downloader.params.get('verbose', False): if self._downloader.params.get('verbose', False):
self._downloader.to_screen( self._downloader.to_screen(
'[debug] Using fake IP %s (%s) as X-Forwarded-For.' '[debug] Using fake IP %s (%s) as X-Forwarded-For.'
% (self._x_forwarded_for_ip, country_code.upper())) % (self._x_forwarded_for_ip, country.upper()))
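
Note on the rewritten `_initialize_geo_bypass`: the new body checks sources in a fixed order. An explicit user IP block comes first, then a random block from the context, then an explicit user country, then a random country from the context. A legacy list or tuple argument is treated as `countries`. The precedence can be sketched as a pure function, as below. This is a simplified model for reading the diff, not the method itself: `choose_geo_bypass_source` and its return shape are hypothetical, and it leaves out the `_x_forwarded_for_ip` guard, the call to `GeoUtils.random_ipv4` and the debug output.

```python
import random


def choose_geo_bypass_source(params, geo_bypass_context,
                             extractor_geo_bypass=True, rng=random):
    """Return ('ip_block', value), ('country', value) or None.

    params models self._downloader.params; extractor_geo_bypass models
    the extractor's _GEO_BYPASS attribute.
    """
    # Geo bypass explicitly disabled by the user
    if not params.get('geo_bypass', True):
        return None

    if not geo_bypass_context:
        geo_bypass_context = {}
    # Backward compatibility: a bare list/tuple means a list of countries
    if isinstance(geo_bypass_context, (list, tuple)):
        geo_bypass_context = {'countries': geo_bypass_context}

    # Path 1: IP block, with the user-supplied block taking priority
    ip_block = params.get('geo_bypass_ip_block')
    if not ip_block:
        ip_blocks = geo_bypass_context.get('ip_blocks')
        if extractor_geo_bypass and ip_blocks:
            ip_block = rng.choice(ip_blocks)
    if ip_block:
        return ('ip_block', ip_block)

    # Path 2: country code, with the user-supplied code taking priority
    country = params.get('geo_bypass_country')
    if not country:
        countries = geo_bypass_context.get('countries')
        if extractor_geo_bypass and countries:
            country = rng.choice(countries)
    if country:
        return ('country', country)
    return None


if __name__ == '__main__':
    print(choose_geo_bypass_source({}, ['US']))
    print(choose_geo_bypass_source(
        {'geo_bypass_country': 'GB'}, {'ip_blocks': ['193.113.0.0/16']}))
```

One consequence visible in the sketch: an IP block found in the context wins over a country the user passed explicitly, because the IP block path returns before the country path is consulted.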
def extract(self, url): def extract(self, url):
"""Extracts URL information and returns it in list of dicts.""" """Extracts URL information and returns it in list of dicts."""

@ -5,7 +5,10 @@ import re
import string import string
from .discoverygo import DiscoveryGoBaseIE from .discoverygo import DiscoveryGoBaseIE
from ..compat import compat_str from ..compat import (
compat_str,
compat_urllib_parse_unquote,
)
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
try_get, try_get,
@ -55,15 +58,27 @@ class DiscoveryIE(DiscoveryGoBaseIE):
video = next(cb for cb in content_blocks if cb.get('type') == 'video')['content']['items'][0] video = next(cb for cb in content_blocks if cb.get('type') == 'video')['content']['items'][0]
video_id = video['id'] video_id = video['id']
access_token = self._download_json( access_token = None
'https://www.%s.com/anonymous' % site, display_id, query={ cookies = self._get_cookies(url)
'authRel': 'authorization',
'client_id': try_get( # prefer Affiliate Auth Token over Anonymous Auth Token
react_data, lambda x: x['application']['apiClientId'], auth_storage_cookie = cookies.get('eosAf') or cookies.get('eosAn')
compat_str) or '3020a40c2356a645b4b4', if auth_storage_cookie and auth_storage_cookie.value:
'nonce': ''.join([random.choice(string.ascii_letters) for _ in range(32)]), auth_storage = self._parse_json(compat_urllib_parse_unquote(
'redirectUri': 'https://fusion.ddmcdn.com/app/mercury-sdk/180/redirectHandler.html?https://www.%s.com' % site, compat_urllib_parse_unquote(auth_storage_cookie.value)),
})['access_token'] video_id, fatal=False) or {}
access_token = auth_storage.get('a') or auth_storage.get('access_token')
if not access_token:
access_token = self._download_json(
'https://www.%s.com/anonymous' % site, display_id, query={
'authRel': 'authorization',
'client_id': try_get(
react_data, lambda x: x['application']['apiClientId'],
compat_str) or '3020a40c2356a645b4b4',
'nonce': ''.join([random.choice(string.ascii_letters) for _ in range(32)]),
'redirectUri': 'https://fusion.ddmcdn.com/app/mercury-sdk/180/redirectHandler.html?https://www.%s.com' % site,
})['access_token']
try: try:
stream = self._download_json( stream = self._download_json(
@ -72,7 +87,7 @@ class DiscoveryIE(DiscoveryGoBaseIE):
'Authorization': 'Bearer ' + access_token, 'Authorization': 'Bearer ' + access_token,
}) })
except ExtractorError as e: except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403: if isinstance(e.cause, compat_HTTPError) and e.cause.code in (401, 403):
e_description = self._parse_json( e_description = self._parse_json(
e.cause.read().decode(), display_id)['description'] e.cause.read().decode(), display_id)['description']
if 'resource not available for country' in e_description: if 'resource not available for country' in e_description:
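
Note on the cookie handling above: the `eosAf` / `eosAn` cookie value is URL-unquoted twice before being parsed as JSON, and the token is read from key `a` or `access_token`. The sketch below reproduces that decoding step on its own, using only the standard library (Python 3). The function name `token_from_auth_cookie` is hypothetical, and the sample cookie is made up for the demonstration, not a real Discovery cookie.

```python
import json
from urllib.parse import quote, unquote


def token_from_auth_cookie(cookie_value):
    """Decode a doubly URL-quoted JSON cookie and return its token, if any."""
    try:
        auth_storage = json.loads(unquote(unquote(cookie_value)))
    except ValueError:
        # Malformed cookie: the caller falls back to the anonymous endpoint
        return None
    if not isinstance(auth_storage, dict):
        return None
    return auth_storage.get('a') or auth_storage.get('access_token')


if __name__ == '__main__':
    # Fabricated sample value, quoted twice to mimic the cookie encoding
    sample = quote(quote(json.dumps({'a': 'example-token'})))
    print(token_from_auth_cookie(sample))
```

Returning `None` on any decoding problem mirrors the `fatal=False` and `or {}` fallback in the diff, which lets extraction continue with the `/anonymous` request.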

@ -102,7 +102,9 @@ class DPlayIE(InfoExtractor):
display_id = mobj.group('id') display_id = mobj.group('id')
domain = mobj.group('domain') domain = mobj.group('domain')
self._initialize_geo_bypass([mobj.group('country').upper()]) self._initialize_geo_bypass({
'countries': [mobj.group('country').upper()],
})
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)

@ -137,6 +137,7 @@ from .brightcove import (
BrightcoveLegacyIE, BrightcoveLegacyIE,
BrightcoveNewIE, BrightcoveNewIE,
) )
from .businessinsider import BusinessInsiderIE
from .buzzfeed import BuzzFeedIE from .buzzfeed import BuzzFeedIE
from .byutv import BYUtvIE from .byutv import BYUtvIE
from .c56 import C56IE from .c56 import C56IE
@ -195,6 +196,7 @@ from .clippit import ClippitIE
from .cliprs import ClipRsIE from .cliprs import ClipRsIE
from .clipsyndicate import ClipsyndicateIE from .clipsyndicate import ClipsyndicateIE
from .closertotruth import CloserToTruthIE from .closertotruth import CloserToTruthIE
from .cloudflarestream import CloudflareStreamIE
from .cloudy import CloudyIE from .cloudy import CloudyIE
from .clubic import ClubicIE from .clubic import ClubicIE
from .clyp import ClypIE from .clyp import ClypIE
@ -477,7 +479,10 @@ from .internetvideoarchive import InternetVideoArchiveIE
from .iprima import IPrimaIE from .iprima import IPrimaIE
from .iqiyi import IqiyiIE from .iqiyi import IqiyiIE
from .ir90tv import Ir90TvIE from .ir90tv import Ir90TvIE
from .itv import ITVIE from .itv import (
ITVIE,
ITVBTCCIE,
)
from .ivi import ( from .ivi import (
IviIE, IviIE,
IviCompilationIE IviCompilationIE
@ -1419,5 +1424,11 @@ from .youtube import (
) )
from .zapiks import ZapiksIE from .zapiks import ZapiksIE
from .zaq1 import Zaq1IE from .zaq1 import Zaq1IE
from .zattoo import (
QuicklineIE,
QuicklineLiveIE,
ZattooIE,
ZattooLiveIE,
)
from .zdf import ZDFIE, ZDFChannelIE from .zdf import ZDFIE, ZDFChannelIE
from .zingmp3 import ZingMp3IE from .zingmp3 import ZingMp3IE

@ -107,6 +107,7 @@ from .springboardplatform import SpringboardPlatformIE
from .yapfiles import YapFilesIE from .yapfiles import YapFilesIE
from .vice import ViceIE from .vice import ViceIE
from .xfileshare import XFileShareIE from .xfileshare import XFileShareIE
from .cloudflarestream import CloudflareStreamIE
class GenericIE(InfoExtractor): class GenericIE(InfoExtractor):
@ -1282,6 +1283,23 @@ class GenericIE(InfoExtractor):
}, },
'add_ie': ['Kaltura'], 'add_ie': ['Kaltura'],
}, },
{
# Kaltura iframe embed, more sophisticated
'url': 'http://www.cns.nyu.edu/~eero/math-tools/Videos/lecture-05sep2017.html',
'info_dict': {
'id': '1_9gzouybz',
'ext': 'mp4',
'title': 'lecture-05sep2017',
'description': 'md5:40f347d91fd4ba047e511c5321064b49',
'upload_date': '20170913',
'uploader_id': 'eps2',
'timestamp': 1505340777,
},
'params': {
'skip_download': True,
},
'add_ie': ['Kaltura'],
},
{ {
# meta twitter:player # meta twitter:player
'url': 'http://thechive.com/2017/12/08/all-i-want-for-christmas-is-more-twerk/', 'url': 'http://thechive.com/2017/12/08/all-i-want-for-christmas-is-more-twerk/',
@ -1454,21 +1472,6 @@ class GenericIE(InfoExtractor):
}, },
'expected_warnings': ['Failed to parse JSON Expecting value'], 'expected_warnings': ['Failed to parse JSON Expecting value'],
}, },
# Ooyala embed
{
'url': 'http://www.businessinsider.com/excel-index-match-vlookup-video-how-to-2015-2?IR=T',
'info_dict': {
'id': '50YnY4czr4ms1vJ7yz3xzq0excz_pUMs',
'ext': 'mp4',
'description': 'Index/Match versus VLOOKUP.',
'title': 'This is what separates the Excel masters from the wannabes',
'duration': 191.933,
},
'params': {
# m3u8 downloads
'skip_download': True,
}
},
# Brightcove URL in single quotes # Brightcove URL in single quotes
{ {
'url': 'http://www.sportsnet.ca/baseball/mlb/sn-presents-russell-martin-world-citizen/', 'url': 'http://www.sportsnet.ca/baseball/mlb/sn-presents-russell-martin-world-citizen/',
@ -1996,6 +1999,19 @@ class GenericIE(InfoExtractor):
'skip_download': True, 'skip_download': True,
}, },
}, },
{
# CloudflareStream embed
'url': 'https://www.cloudflare.com/products/cloudflare-stream/',
'info_dict': {
'id': '31c9291ab41fac05471db4e73aa11717',
'ext': 'mp4',
'title': '31c9291ab41fac05471db4e73aa11717',
},
'add_ie': [CloudflareStreamIE.ie_key()],
'params': {
'skip_download': True,
},
},
{ {
'url': 'http://share-videos.se/auto/video/83645793?uid=13', 'url': 'http://share-videos.se/auto/video/83645793?uid=13',
'md5': 'b68d276de422ab07ee1d49388103f457', 'md5': 'b68d276de422ab07ee1d49388103f457',
@ -3008,6 +3024,11 @@ class GenericIE(InfoExtractor):
return self.playlist_from_matches( return self.playlist_from_matches(
xfileshare_urls, video_id, video_title, ie=XFileShareIE.ie_key()) xfileshare_urls, video_id, video_title, ie=XFileShareIE.ie_key())
cloudflarestream_urls = CloudflareStreamIE._extract_urls(webpage)
if cloudflarestream_urls:
return self.playlist_from_matches(
cloudflarestream_urls, video_id, video_title, ie=CloudflareStreamIE.ie_key())
sharevideos_urls = [mobj.group('url') for mobj in re.finditer( sharevideos_urls = [mobj.group('url') for mobj in re.finditer(
r'<iframe[^>]+?\bsrc\s*=\s*(["\'])(?P<url>(?:https?:)?//embed\.share-videos\.se/auto/embed/\d+\?.*?\buid=\d+.*?)\1', r'<iframe[^>]+?\bsrc\s*=\s*(["\'])(?P<url>(?:https?:)?//embed\.share-videos\.se/auto/embed/\d+\?.*?\buid=\d+.*?)\1',
webpage)] webpage)]

@ -123,7 +123,7 @@ class GoIE(AdobePassIE):
'adobe_requestor_id': requestor_id, 'adobe_requestor_id': requestor_id,
}) })
else: else:
self._initialize_geo_bypass(['US']) self._initialize_geo_bypass({'countries': ['US']})
entitlement = self._download_json( entitlement = self._download_json(
'https://api.entitlement.watchabc.go.com/vp2/ws-secure/entitlement/2020/authorize.json', 'https://api.entitlement.watchabc.go.com/vp2/ws-secure/entitlement/2020/authorize.json',
video_id, data=urlencode_postdata(data)) video_id, data=urlencode_postdata(data))

@ -7,6 +7,7 @@ import json
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from .brightcove import BrightcoveNewIE
from ..compat import ( from ..compat import (
compat_str, compat_str,
compat_etree_register_namespace, compat_etree_register_namespace,
@ -18,6 +19,7 @@ from ..utils import (
xpath_text, xpath_text,
int_or_none, int_or_none,
parse_duration, parse_duration,
smuggle_url,
ExtractorError, ExtractorError,
determine_ext, determine_ext,
) )
@ -41,6 +43,14 @@ class ITVIE(InfoExtractor):
# unavailable via data-playlist-url # unavailable via data-playlist-url
'url': 'https://www.itv.com/hub/through-the-keyhole/2a2271a0033', 'url': 'https://www.itv.com/hub/through-the-keyhole/2a2271a0033',
'only_matching': True, 'only_matching': True,
}, {
# InvalidVodcrid
'url': 'https://www.itv.com/hub/james-martins-saturday-morning/2a5159a0034',
'only_matching': True,
}, {
# ContentUnavailable
'url': 'https://www.itv.com/hub/whos-doing-the-dishes/2a2898a0024',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -127,7 +137,8 @@ class ITVIE(InfoExtractor):
if fault_code == 'InvalidGeoRegion': if fault_code == 'InvalidGeoRegion':
self.raise_geo_restricted( self.raise_geo_restricted(
msg=fault_string, countries=self._GEO_COUNTRIES) msg=fault_string, countries=self._GEO_COUNTRIES)
elif fault_code != 'InvalidEntity': elif fault_code not in (
'InvalidEntity', 'InvalidVodcrid', 'ContentUnavailable'):
raise ExtractorError( raise ExtractorError(
'%s said: %s' % (self.IE_NAME, fault_string), expected=True) '%s said: %s' % (self.IE_NAME, fault_string), expected=True)
info.update({ info.update({
@ -251,3 +262,38 @@ class ITVIE(InfoExtractor):
'subtitles': subtitles, 'subtitles': subtitles,
}) })
return info return info
class ITVBTCCIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?itv\.com/btcc/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://www.itv.com/btcc/races/btcc-2018-all-the-action-from-brands-hatch',
'info_dict': {
'id': 'btcc-2018-all-the-action-from-brands-hatch',
'title': 'BTCC 2018: All the action from Brands Hatch',
},
'playlist_mincount': 9,
}
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1582188683001/HkiHLnNRx_default/index.html?videoId=%s'
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
entries = [
self.url_result(
smuggle_url(self.BRIGHTCOVE_URL_TEMPLATE % video_id, {
# ITV does not like some GB IP ranges, so here are some
# IP blocks it accepts
'geo_ip_blocks': [
'193.113.0.0/16', '54.36.162.0/23', '159.65.16.0/21'
],
'referrer': url,
}),
ie=BrightcoveNewIE.ie_key(), video_id=video_id)
for video_id in re.findall(r'data-video-id=["\'](\d+)', webpage)]
title = self._og_search_title(webpage, fatal=False)
return self.playlist_result(entries, playlist_id, title)

@ -136,9 +136,10 @@ class KalturaIE(InfoExtractor):
re.search( re.search(
r'''(?xs) r'''(?xs)
<(?:iframe[^>]+src|meta[^>]+\bcontent)=(?P<q1>["']) <(?:iframe[^>]+src|meta[^>]+\bcontent)=(?P<q1>["'])
(?:https?:)?//(?:(?:www|cdnapi)\.)?kaltura\.com/(?:(?!(?P=q1)).)*\b(?:p|partner_id)/(?P<partner_id>\d+) (?:https?:)?//(?:(?:www|cdnapi(?:sec)?)\.)?kaltura\.com/(?:(?!(?P=q1)).)*\b(?:p|partner_id)/(?P<partner_id>\d+)
(?:(?!(?P=q1)).)* (?:(?!(?P=q1)).)*
[?&;]entry_id=(?P<id>(?:(?!(?P=q1))[^&])+) [?&;]entry_id=(?P<id>(?:(?!(?P=q1))[^&])+)
(?:(?!(?P=q1)).)*
(?P=q1) (?P=q1)
''', webpage) ''', webpage)
) )

@ -282,7 +282,9 @@ class LimelightMediaIE(LimelightBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
video_id = self._match_id(url) video_id = self._match_id(url)
self._initialize_geo_bypass(smuggled_data.get('geo_countries')) self._initialize_geo_bypass({
'countries': smuggled_data.get('geo_countries'),
})
pc, mobile, metadata = self._extract( pc, mobile, metadata = self._extract(
video_id, 'getPlaylistByMediaId', video_id, 'getPlaylistByMediaId',


@ -62,7 +62,7 @@ class TuneInBaseIE(InfoExtractor):
return { return {
'id': content_id, 'id': content_id,
'title': title, 'title': self._live_title(title) if is_live else title,
'formats': formats, 'formats': formats,
'thumbnail': thumbnail, 'thumbnail': thumbnail,
'location': location, 'location': location,


@ -227,14 +227,16 @@ class TVPlayIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
self._initialize_geo_bypass(smuggled_data.get('geo_countries')) self._initialize_geo_bypass({
'countries': smuggled_data.get('geo_countries'),
})
video_id = self._match_id(url) video_id = self._match_id(url)
geo_country = self._search_regex( geo_country = self._search_regex(
r'https?://[^/]+\.([a-z]{2})', url, r'https?://[^/]+\.([a-z]{2})', url,
'geo country', default=None) 'geo country', default=None)
if geo_country: if geo_country:
self._initialize_geo_bypass([geo_country.upper()]) self._initialize_geo_bypass({'countries': [geo_country.upper()]})
video = self._download_json( video = self._download_json(
'http://playapi.mtgx.tv/v3/videos/%s' % video_id, video_id, 'Downloading video JSON') 'http://playapi.mtgx.tv/v3/videos/%s' % video_id, video_id, 'Downloading video JSON')


@ -18,6 +18,7 @@ from ..utils import (
int_or_none, int_or_none,
js_to_json, js_to_json,
sanitized_Request, sanitized_Request,
try_get,
unescapeHTML, unescapeHTML,
urlencode_postdata, urlencode_postdata,
) )
@ -58,6 +59,10 @@ class UdemyIE(InfoExtractor):
# no url in outputs format entry # no url in outputs format entry
'url': 'https://www.udemy.com/learn-web-development-complete-step-by-step-guide-to-success/learn/v4/t/lecture/4125812', 'url': 'https://www.udemy.com/learn-web-development-complete-step-by-step-guide-to-success/learn/v4/t/lecture/4125812',
'only_matching': True, 'only_matching': True,
}, {
# only outputs rendition
'url': 'https://www.udemy.com/how-you-can-help-your-local-community-5-amazing-examples/learn/v4/t/lecture/3225750?start=0',
'only_matching': True,
}] }]
def _extract_course_info(self, webpage, video_id): def _extract_course_info(self, webpage, video_id):
@ -101,7 +106,7 @@ class UdemyIE(InfoExtractor):
% (course_id, lecture_id), % (course_id, lecture_id),
lecture_id, 'Downloading lecture JSON', query={ lecture_id, 'Downloading lecture JSON', query={
'fields[lecture]': 'title,description,view_html,asset', 'fields[lecture]': 'title,description,view_html,asset',
'fields[asset]': 'asset_type,stream_url,thumbnail_url,download_urls,data', 'fields[asset]': 'asset_type,stream_url,thumbnail_url,download_urls,stream_urls,captions,data',
}) })
def _handle_error(self, response): def _handle_error(self, response):
@ -299,9 +304,25 @@ class UdemyIE(InfoExtractor):
'url': src, 'url': src,
}) })
download_urls = asset.get('download_urls') for url_kind in ('download', 'stream'):
if isinstance(download_urls, dict): urls = asset.get('%s_urls' % url_kind)
extract_formats(download_urls.get('Video')) if isinstance(urls, dict):
extract_formats(urls.get('Video'))
captions = asset.get('captions')
if isinstance(captions, list):
for cc in captions:
if not isinstance(cc, dict):
continue
cc_url = cc.get('url')
if not cc_url or not isinstance(cc_url, compat_str):
continue
lang = try_get(cc, lambda x: x['locale']['locale'], compat_str)
sub_dict = (automatic_captions if cc.get('source') == 'auto'
else subtitles)
sub_dict.setdefault(lang or 'en', []).append({
'url': cc_url,
})
view_html = lecture.get('view_html') view_html = lecture.get('view_html')
if view_html: if view_html:
@ -357,6 +378,12 @@ class UdemyIE(InfoExtractor):
fatal=False) fatal=False)
extract_subtitles(text_tracks) extract_subtitles(text_tracks)
if not formats and outputs:
for format_id, output in outputs.items():
f = extract_output_format(output, format_id)
if f.get('url'):
formats.append(f)
self._sort_formats(formats, field_preference=('height', 'width', 'tbr', 'format_id')) self._sort_formats(formats, field_preference=('height', 'width', 'tbr', 'format_id'))
return { return {


@ -69,7 +69,7 @@ class WatchBoxIE(InfoExtractor):
source = self._parse_json( source = self._parse_json(
self._search_regex( self._search_regex(
r'(?s)source\s*:\s*({.+?})\s*,\s*\n', webpage, 'source', r'(?s)source["\']?\s*:\s*({.+?})\s*[,}]', webpage, 'source',
default='{}'), default='{}'),
video_id, transform_source=js_to_json, fatal=False) or {} video_id, transform_source=js_to_json, fatal=False) or {}


@ -0,0 +1,270 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from uuid import uuid4
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_str,
)
from ..utils import (
ExtractorError,
int_or_none,
try_get,
urlencode_postdata,
)
class ZattooBaseIE(InfoExtractor):
_NETRC_MACHINE = 'zattoo'
_HOST_URL = 'https://zattoo.com'
_power_guide_hash = None
def _login(self):
(username, password) = self._get_login_info()
if not username or not password:
self.raise_login_required(
'A valid %s account is needed to access this media.'
% self._NETRC_MACHINE)
try:
data = self._download_json(
'%s/zapi/v2/account/login' % self._HOST_URL, None, 'Logging in',
data=urlencode_postdata({
'login': username,
'password': password,
'remember': 'true',
}), headers={
'Referer': '%s/login' % self._HOST_URL,
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
raise ExtractorError(
'Unable to login: incorrect username and/or password',
expected=True)
raise
self._power_guide_hash = data['session']['power_guide_hash']
def _real_initialize(self):
webpage = self._download_webpage(
self._HOST_URL, None, 'Downloading app token')
app_token = self._html_search_regex(
r'appToken\s*=\s*(["\'])(?P<token>(?:(?!\1).)+?)\1',
webpage, 'app token', group='token')
app_version = self._html_search_regex(
r'<!--\w+-(.+?)-', webpage, 'app version', default='2.8.2')
# Will setup appropriate cookies
self._request_webpage(
'%s/zapi/v2/session/hello' % self._HOST_URL, None,
'Opening session', data=urlencode_postdata({
'client_app_token': app_token,
'uuid': compat_str(uuid4()),
'lang': 'en',
'app_version': app_version,
'format': 'json',
}))
self._login()
def _extract_cid(self, video_id, channel_name):
channel_groups = self._download_json(
'%s/zapi/v2/cached/channels/%s' % (self._HOST_URL,
self._power_guide_hash),
video_id, 'Downloading channel list',
query={'details': False})['channel_groups']
channel_list = []
for chgrp in channel_groups:
channel_list.extend(chgrp['channels'])
try:
return next(
chan['cid'] for chan in channel_list
if chan.get('cid') and (
chan.get('display_alias') == channel_name or
chan.get('cid') == channel_name))
except StopIteration:
raise ExtractorError('Could not extract channel id')
def _extract_cid_and_video_info(self, video_id):
data = self._download_json(
'%s/zapi/program/details' % self._HOST_URL,
video_id,
'Downloading video information',
query={
'program_id': video_id,
'complete': True
})
p = data['program']
cid = p['cid']
info_dict = {
'id': video_id,
'title': p.get('title') or p['episode_title'],
'description': p.get('description'),
'thumbnail': p.get('image_url'),
'creator': p.get('channel_name'),
'episode': p.get('episode_title'),
'episode_number': int_or_none(p.get('episode_number')),
'season_number': int_or_none(p.get('season_number')),
'release_year': int_or_none(p.get('year')),
'categories': try_get(p, lambda x: x['categories'], list),
}
return cid, info_dict
def _extract_formats(self, cid, video_id, record_id=None, is_live=False):
postdata_common = {
'https_watch_urls': True,
}
if is_live:
postdata_common.update({'timeshift': 10800})
url = '%s/zapi/watch/live/%s' % (self._HOST_URL, cid)
elif record_id:
url = '%s/zapi/watch/recording/%s' % (self._HOST_URL, record_id)
else:
url = '%s/zapi/watch/recall/%s/%s' % (self._HOST_URL, cid, video_id)
formats = []
for stream_type in ('dash', 'hls', 'hls5', 'hds'):
postdata = postdata_common.copy()
postdata['stream_type'] = stream_type
data = self._download_json(
url, video_id, 'Downloading %s formats' % stream_type.upper(),
data=urlencode_postdata(postdata), fatal=False)
if not data:
continue
watch_urls = try_get(
data, lambda x: x['stream']['watch_urls'], list)
if not watch_urls:
continue
for watch in watch_urls:
if not isinstance(watch, dict):
continue
watch_url = watch.get('url')
if not watch_url or not isinstance(watch_url, compat_str):
continue
format_id_list = [stream_type]
maxrate = watch.get('maxrate')
if maxrate:
format_id_list.append(compat_str(maxrate))
audio_channel = watch.get('audio_channel')
if audio_channel:
format_id_list.append(compat_str(audio_channel))
preference = 1 if audio_channel == 'A' else None
format_id = '-'.join(format_id_list)
if stream_type in ('dash', 'dash_widevine', 'dash_playready'):
this_formats = self._extract_mpd_formats(
watch_url, video_id, mpd_id=format_id, fatal=False)
elif stream_type in ('hls', 'hls5', 'hls5_fairplay'):
this_formats = self._extract_m3u8_formats(
watch_url, video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id=format_id,
fatal=False)
elif stream_type == 'hds':
this_formats = self._extract_f4m_formats(
watch_url, video_id, f4m_id=format_id, fatal=False)
elif stream_type == 'smooth_playready':
this_formats = self._extract_ism_formats(
watch_url, video_id, ism_id=format_id, fatal=False)
else:
assert False
for this_format in this_formats:
this_format['preference'] = preference
formats.extend(this_formats)
self._sort_formats(formats)
return formats
def _extract_video(self, channel_name, video_id, record_id=None, is_live=False):
if is_live:
cid = self._extract_cid(video_id, channel_name)
info_dict = {
'id': channel_name,
'title': self._live_title(channel_name),
'is_live': True,
}
else:
cid, info_dict = self._extract_cid_and_video_info(video_id)
formats = self._extract_formats(
cid, video_id, record_id=record_id, is_live=is_live)
info_dict['formats'] = formats
return info_dict
class QuicklineBaseIE(ZattooBaseIE):
_NETRC_MACHINE = 'quickline'
_HOST_URL = 'https://mobiltv.quickline.com'
class QuicklineIE(QuicklineBaseIE):
_VALID_URL = r'https?://(?:www\.)?mobiltv\.quickline\.com/watch/(?P<channel>[^/]+)/(?P<id>[0-9]+)'
_TEST = {
'url': 'https://mobiltv.quickline.com/watch/prosieben/130671867-maze-runner-die-auserwaehlten-in-der-brandwueste',
'only_matching': True,
}
def _real_extract(self, url):
channel_name, video_id = re.match(self._VALID_URL, url).groups()
return self._extract_video(channel_name, video_id)
class QuicklineLiveIE(QuicklineBaseIE):
_VALID_URL = r'https?://(?:www\.)?mobiltv\.quickline\.com/watch/(?P<id>[^/]+)'
_TEST = {
'url': 'https://mobiltv.quickline.com/watch/srf1',
'only_matching': True,
}
@classmethod
def suitable(cls, url):
return False if QuicklineIE.suitable(url) else super(QuicklineLiveIE, cls).suitable(url)
def _real_extract(self, url):
channel_name = video_id = self._match_id(url)
return self._extract_video(channel_name, video_id, is_live=True)
class ZattooIE(ZattooBaseIE):
_VALID_URL = r'https?://(?:www\.)?zattoo\.com/watch/(?P<channel>[^/]+?)/(?P<id>[0-9]+)[^/]+(?:/(?P<recid>[0-9]+))?'
# Since regular videos are only available for 7 days and recorded videos
# are only available for a specific user, we cannot have detailed tests.
_TESTS = [{
'url': 'https://zattoo.com/watch/prosieben/130671867-maze-runner-die-auserwaehlten-in-der-brandwueste',
'only_matching': True,
}, {
'url': 'https://zattoo.com/watch/srf_zwei/132905652-eishockey-spengler-cup/102791477/1512211800000/1514433500000/92000',
'only_matching': True,
}]
def _real_extract(self, url):
channel_name, video_id, record_id = re.match(self._VALID_URL, url).groups()
return self._extract_video(channel_name, video_id, record_id)
class ZattooLiveIE(ZattooBaseIE):
_VALID_URL = r'https?://(?:www\.)?zattoo\.com/watch/(?P<id>[^/]+)'
_TEST = {
'url': 'https://zattoo.com/watch/srf1',
'only_matching': True,
}
@classmethod
def suitable(cls, url):
return False if ZattooIE.suitable(url) else super(ZattooLiveIE, cls).suitable(url)
def _real_extract(self, url):
channel_name = video_id = self._match_id(url)
return self._extract_video(channel_name, video_id, is_live=True)
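
The `suitable()` overrides above exist because the live pattern also matches on-demand URLs. A minimal sketch of that precedence, assuming only the standard library; `classify` is a hypothetical helper, not part of youtube-dl, and the URLs are the test URLs from this file.

```python
import re

# Patterns copied from ZattooIE and ZattooLiveIE above.
ZATTOO_VOD_URL = r'https?://(?:www\.)?zattoo\.com/watch/(?P<channel>[^/]+?)/(?P<id>[0-9]+)[^/]+(?:/(?P<recid>[0-9]+))?'
ZATTOO_LIVE_URL = r'https?://(?:www\.)?zattoo\.com/watch/(?P<id>[^/]+)'


def classify(url):
    """Return (kind, groups), trying the more specific pattern first."""
    mobj = re.match(ZATTOO_VOD_URL, url)
    if mobj:
        return 'vod', mobj.groups()
    mobj = re.match(ZATTOO_LIVE_URL, url)
    if mobj:
        return 'live', mobj.groups()
    return None, ()


live = classify('https://zattoo.com/watch/srf1')
vod = classify(
    'https://zattoo.com/watch/prosieben/'
    '130671867-maze-runner-die-auserwaehlten-in-der-brandwueste')
recording = classify(
    'https://zattoo.com/watch/srf_zwei/132905652-eishockey-spengler-cup/'
    '102791477/1512211800000/1514433500000/92000')
```

Checking the on-demand pattern first mirrors `ZattooLiveIE.suitable()` returning `False` whenever `ZattooIE.suitable()` accepts the URL.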


@ -249,6 +249,10 @@ def parseOpts(overrideArguments=None):
'--geo-bypass-country', metavar='CODE', '--geo-bypass-country', metavar='CODE',
dest='geo_bypass_country', default=None, dest='geo_bypass_country', default=None,
help='Force bypass geographic restriction with explicitly provided two-letter ISO 3166-2 country code (experimental)') help='Force bypass geographic restriction with explicitly provided two-letter ISO 3166-2 country code (experimental)')
geo.add_option(
'--geo-bypass-ip-block', metavar='IP_BLOCK',
dest='geo_bypass_ip_block', default=None,
help='Force bypass geographic restriction with explicitly provided IP block in CIDR notation (experimental)')
selection = optparse.OptionGroup(parser, 'Video Selection') selection = optparse.OptionGroup(parser, 'Video Selection')
selection.add_option( selection.add_option(


@ -3534,10 +3534,13 @@ class GeoUtils(object):
} }
@classmethod @classmethod
def random_ipv4(cls, code): def random_ipv4(cls, code_or_block):
block = cls._country_ip_map.get(code.upper()) if len(code_or_block) == 2:
if not block: block = cls._country_ip_map.get(code_or_block.upper())
return None if not block:
return None
else:
block = code_or_block
addr, preflen = block.split('/') addr, preflen = block.split('/')
addr_min = compat_struct_unpack('!L', socket.inet_aton(addr))[0] addr_min = compat_struct_unpack('!L', socket.inet_aton(addr))[0]
addr_max = addr_min | (0xffffffff >> int(preflen)) addr_max = addr_min | (0xffffffff >> int(preflen))
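
The range arithmetic in this hunk can be tried on its own. This is a minimal sketch assuming Python 3 and the standard library in place of the `compat_*` helpers; `random_ipv4_from_block` is a hypothetical name, and the country-code lookup branch is left out.

```python
import random
import socket
import struct


def random_ipv4_from_block(block):
    """Pick a random IPv4 address inside a CIDR block such as '193.113.0.0/16'."""
    addr, preflen = block.split('/')
    # Network address as an unsigned 32-bit integer in network byte order.
    addr_min = struct.unpack('!L', socket.inet_aton(addr))[0]
    # Setting every host bit gives the highest address in the block.
    addr_max = addr_min | (0xffffffff >> int(preflen))
    chosen = random.randint(addr_min, addr_max)
    return socket.inet_ntoa(struct.pack('!L', chosen))


# One of the blocks smuggled by ITVBTCCIE earlier in this commit.
example_ip = random_ipv4_from_block('193.113.0.0/16')
```

As in the hunk, the base address is assumed to be the network address of the block; a base with host bits already set would narrow the range.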


@ -1,3 +1,3 @@
from __future__ import unicode_literals from __future__ import unicode_literals
__version__ = '2018.04.25' __version__ = '2018.05.09'