Merge branch 'master' into PornHub-issue-16078

Parmjit Virk 2018-10-21 20:04:33 -05:00
commit 5debd37c30
25 changed files with 636 additions and 313 deletions

.github/ISSUE_TEMPLATE.md

@@ -6,8 +6,8 @@
 ---
 
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.09.18*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.09.18**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.10.05*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.10.05**
 
 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -36,7 +36,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2018.09.18
+[debug] youtube-dl version 2018.10.05
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}

ChangeLog

@@ -1,3 +1,29 @@
+version 2018.10.05
+
+Extractors
+* [pluralsight] Improve authentication (#17762)
+* [dailymotion] Fix extraction (#17699)
+* [crunchyroll] Switch to HTTPS for RpcApi (#17749)
++ [philharmoniedeparis] Add support for pad.philharmoniedeparis.fr (#17705)
+* [philharmoniedeparis] Fix extraction (#17705)
++ [jamendo] Add support for licensing.jamendo.com (#17724)
++ [openload] Add support for oload.cloud (#17710)
+* [pluralsight] Fix subtitles extraction (#17726, #17728)
++ [vimeo] Add another config regular expression (#17690)
+* [spike] Fix Paramount Network extraction (#17677)
+* [hotstar] Fix extraction (#14694, #14931, #17637)
+
+
+version 2018.09.26
+
+Extractors
+* [pluralsight] Fix subtitles extraction (#17671)
+* [mediaset] Improve embed support (#17668)
++ [youtube] Add support for invidio.us (#17613)
++ [zattoo] Add support for more zattoo platform sites
+* [zattoo] Fix extraction (#17175, #17542)
+
+
 version 2018.09.18
 
 Core

README.md

@@ -511,6 +511,8 @@ The basic usage is not to set any template arguments when downloading a single f
  - `timestamp` (numeric): UNIX timestamp of the moment the video became available
  - `upload_date` (string): Video upload date (YYYYMMDD)
  - `uploader_id` (string): Nickname or id of the video uploader
+ - `channel` (string): Full name of the channel the video is uploaded on
+ - `channel_id` (string): Id of the channel
  - `location` (string): Physical location where the video was filmed
  - `duration` (numeric): Length of the video in seconds
  - `view_count` (numeric): How many users have watched the video on the platform
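
A minimal sketch (not part of the diff) of how the two new template fields can be used from the Python embedding API; the URL is just an example, and fields an extractor does not report render as 'NA':

from __future__ import unicode_literals

import youtube_dl

# 'simulate' plus 'forcefilename' only prints the file name the template
# would produce, without downloading anything.
ydl_opts = {
    'outtmpl': '%(channel)s-%(channel_id)s-%(title)s-%(id)s.%(ext)s',
    'simulate': True,
    'forcefilename': True,
}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])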

docs/supportedsites.md

@@ -98,6 +98,7 @@
 - **bbc.co.uk:article**: BBC articles
 - **bbc.co.uk:iplayer:playlist**
 - **bbc.co.uk:playlist**
+- **BBVTV**
 - **Beatport**
 - **Beeg**
 - **BehindKink**
@@ -251,6 +252,7 @@
 - **egghead:course**: egghead.io course
 - **egghead:lesson**: egghead.io lesson
 - **eHow**
+- **EinsUndEinsTV**
 - **Einthusan**
 - **eitb.tv**
 - **EllenTube**
@@ -268,6 +270,7 @@
 - **EsriVideo**
 - **Europa**
 - **EveryonesMixtape**
+- **EWETV**
 - **ExpoTV**
 - **Expressen**
 - **ExtremeTube**
@@ -327,6 +330,7 @@
 - **Gfycat**
 - **GiantBomb**
 - **Giga**
+- **GlattvisionTV**
 - **Glide**: Glide mobile video messages (glide.me)
 - **Globo**
 - **GloboArticle**
@@ -356,7 +360,7 @@
 - **HitRecord**
 - **HornBunny**
 - **HotNewHipHop**
-- **HotStar**
+- **hotstar**
 - **hotstar:playlist**
 - **Howcast**
 - **HowStuffWorks**
@@ -494,6 +498,7 @@
 - **Mixer:vod**
 - **MLB**
 - **Mnet**
+- **MNetTV**
 - **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
 - **Mofosex**
 - **Mojvideo**
@@ -525,6 +530,7 @@
 - **Myvi**
 - **MyVidster**
 - **MyviEmbed**
+- **MyVisionTV**
 - **n-tv.de**
 - **natgeo**
 - **natgeo:episodeguide**
@@ -550,6 +556,7 @@
 - **netease:program**: 网易云音乐 - 电台节目
 - **netease:singer**: 网易云音乐 - 歌手
 - **netease:song**: 网易云音乐
+- **NetPlus**
 - **Netzkino**
 - **Newgrounds**
 - **NewgroundsPlaylist**
@@ -626,6 +633,7 @@
 - **orf:iptv**: iptv.ORF.at
 - **orf:oe1**: Radio Österreich 1
 - **orf:tvthek**: ORF TVthek
+- **OsnatelTV**
 - **PacktPub**
 - **PacktPubCourse**
 - **PandaTV**: 熊猫TV
@@ -686,6 +694,7 @@
 - **qqmusic:playlist**: QQ音乐 - 歌单
 - **qqmusic:singer**: QQ音乐 - 歌手
 - **qqmusic:toplist**: QQ音乐 - 排行榜
+- **QuantumTV**
 - **Quickline**
 - **QuicklineLive**
 - **R7**
@@ -753,6 +762,7 @@
 - **safari**: safaribooksonline.com online video
 - **safari:api**
 - **safari:course**: safaribooksonline.com online courses
+- **SAKTV**
 - **Sapo**: SAPO Vídeos
 - **savefrom.net**
 - **SBS**: sbs.com.au
@@ -1035,12 +1045,14 @@
 - **vrv**
 - **vrv:series**
 - **VShare**
+- **VTXTV**
 - **vube**: Vube.com
 - **VuClip**
 - **VVVVID**
 - **VyboryMos**
 - **Vzaar**
 - **Walla**
+- **WalyTV**
 - **washingtonpost**
 - **washingtonpost:article**
 - **wat.tv**

youtube_dl/extractor/brightcove.py

@@ -1,8 +1,10 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
-import re
+import base64
 import json
+import re
+import struct
 
 from .common import InfoExtractor
 from .adobepass import AdobePassIE
@@ -310,6 +312,10 @@ class BrightcoveLegacyIE(InfoExtractor):
             'Cannot find playerKey= variable. Did you forget quotes in a shell invocation?',
             expected=True)
 
+    def _brightcove_new_url_result(self, publisher_id, video_id):
+        brightcove_new_url = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s' % (publisher_id, video_id)
+        return self.url_result(brightcove_new_url, BrightcoveNewIE.ie_key(), video_id)
+
     def _get_video_info(self, video_id, query, referer=None):
         headers = {}
         linkBase = query.get('linkBaseURL')
@@ -323,6 +329,28 @@ class BrightcoveLegacyIE(InfoExtractor):
             r"<h1>We're sorry.</h1>([\s\n]*<p>.*?</p>)+", webpage,
             'error message', default=None)
         if error_msg is not None:
+            publisher_id = query.get('publisherId')
+            if publisher_id and publisher_id[0].isdigit():
+                publisher_id = publisher_id[0]
+            if not publisher_id:
+                player_key = query.get('playerKey')
+                if player_key and ',' in player_key[0]:
+                    player_key = player_key[0]
+                else:
+                    player_id = query.get('playerID')
+                    if player_id and player_id[0].isdigit():
+                        player_page = self._download_webpage(
+                            'http://link.brightcove.com/services/player/bcpid' + player_id[0],
+                            video_id, headers=headers, fatal=False)
+                        if player_page:
+                            player_key = self._search_regex(
+                                r'<param\s+name="playerKey"\s+value="([\w~,-]+)"',
+                                player_page, 'player key', fatal=False)
+                if player_key:
+                    enc_pub_id = player_key.split(',')[1].replace('~', '=')
+                    publisher_id = struct.unpack('>Q', base64.urlsafe_b64decode(enc_pub_id))[0]
+            if publisher_id:
+                return self._brightcove_new_url_result(publisher_id, video_id)
             raise ExtractorError(
                 'brightcove said: %s' % error_msg, expected=True)
@@ -444,8 +472,12 @@
             else:
                 return ad_info
 
-        if 'url' not in info and not info.get('formats'):
-            raise ExtractorError('Unable to extract video url for %s' % video_id)
+        if not info.get('url') and not info.get('formats'):
+            uploader_id = info.get('uploader_id')
+            if uploader_id:
+                info.update(self._brightcove_new_url_result(uploader_id, video_id))
+            else:
+                raise ExtractorError('Unable to extract video url for %s' % video_id)
         return info
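
A standalone sketch (not from the patch itself) of the playerKey-to-publisher-id decoding the new fallback relies on; the round-trip id below is made up for illustration:

import base64
import struct

def publisher_id_from_player_key(player_key):
    # The second comma-separated field of a legacy playerKey is urlsafe
    # base64 (with '~' standing in for '=' padding) of a big-endian
    # 64-bit integer: the Brightcove account/publisher id.
    enc_pub_id = player_key.split(',')[1].replace('~', '=')
    return struct.unpack('>Q', base64.urlsafe_b64decode(enc_pub_id))[0]

# Round-trip demo with a fabricated id; real playerKey values come from
# the bcpid player page scraped above.
blob = base64.urlsafe_b64encode(struct.pack('>Q', 1234567890001)).decode().replace('=', '~')
print(publisher_id_from_player_key('AQ~~,' + blob + ',AAAA'))  # -> 1234567890001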

youtube_dl/extractor/crunchyroll.py

@@ -45,7 +45,7 @@ class CrunchyrollBaseIE(InfoExtractor):
         data['req'] = 'RpcApi' + method
         data = compat_urllib_parse_urlencode(data).encode('utf-8')
         return self._download_xml(
-            'http://www.crunchyroll.com/xml/',
+            'https://www.crunchyroll.com/xml/',
             video_id, note, fatal=False, data=data, headers={
                 'Content-Type': 'application/x-www-form-urlencoded',
             })

youtube_dl/extractor/cwtv.py

@@ -3,6 +3,7 @@ from __future__ import unicode_literals
 
 from .common import InfoExtractor
 from ..utils import (
+    ExtractorError,
     int_or_none,
     parse_age_limit,
    parse_iso8601,
@@ -66,9 +67,12 @@ class CWTVIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)
-        video_data = self._download_json(
+        data = self._download_json(
             'http://images.cwtv.com/feed/mobileapp/video-meta/apiversion_8/guid_' + video_id,
-            video_id)['video']
+            video_id)
+        if data.get('result') != 'ok':
+            raise ExtractorError(data['msg'], expected=True)
+        video_data = data['video']
         title = video_data['title']
 
         mpx_url = video_data.get('mpx_url') or 'http://link.theplatform.com/s/cwtv/media/guid/2703454149/%s?formats=M3U' % video_id

youtube_dl/extractor/dailymotion.py

@@ -22,7 +22,10 @@ from ..utils import (
     parse_iso8601,
     sanitized_Request,
     str_to_int,
+    try_get,
     unescapeHTML,
+    update_url_query,
+    url_or_none,
     urlencode_postdata,
 )
@@ -171,10 +174,25 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
              r'__PLAYER_CONFIG__\s*=\s*({.+?});'],
             webpage, 'player v5', default=None)
         if player_v5:
-            player = self._parse_json(player_v5, video_id)
-            metadata = player['metadata']
+            player = self._parse_json(player_v5, video_id, fatal=False) or {}
+            metadata = try_get(player, lambda x: x['metadata'], dict)
+            if not metadata:
+                metadata_url = url_or_none(try_get(
+                    player, lambda x: x['context']['metadata_template_url1']))
+                if metadata_url:
+                    metadata_url = metadata_url.replace(':videoId', video_id)
+                else:
+                    metadata_url = update_url_query(
+                        'https://www.dailymotion.com/player/metadata/video/%s'
+                        % video_id, {
+                            'embedder': url,
+                            'integration': 'inline',
+                            'GK_PV5_NEON': '1',
+                        })
+                metadata = self._download_json(
+                    metadata_url, video_id, 'Downloading metadata JSON')
 
-            if metadata.get('error', {}).get('type') == 'password_protected':
+            if try_get(metadata, lambda x: x['error']['type']) == 'password_protected':
                 password = self._downloader.params.get('videopassword')
                 if password:
                     r = int(metadata['id'][1:], 36)
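
A rough standalone equivalent (assumption: plain urllib instead of youtube-dl's update_url_query helper) of how the fallback metadata URL above gets its query parameters; the video id is only an example:

from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

def with_query(url, params):
    # Merge the extra parameters into whatever query the URL already has.
    parts = urlparse(url)
    query = dict(parse_qsl(parts.query))
    query.update(params)
    return urlunparse(parts._replace(query=urlencode(query)))

video_id = 'x7tgad0'  # illustrative id
print(with_query(
    'https://www.dailymotion.com/player/metadata/video/%s' % video_id, {
        'embedder': 'https://www.dailymotion.com/video/%s' % video_id,
        'integration': 'inline',
        'GK_PV5_NEON': '1',
    }))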

youtube_dl/extractor/extractors.py

@@ -1153,7 +1153,6 @@ from .tv2 import (
     TV2ArticleIE,
 )
 from .tv2hu import TV2HuIE
-from .tv3 import TV3IE
 from .tv4 import TV4IE
 from .tv5mondeplus import TV5MondePlusIE
 from .tva import TVAIE
@@ -1455,8 +1454,20 @@ from .youtube import (
 from .zapiks import ZapiksIE
 from .zaq1 import Zaq1IE
 from .zattoo import (
+    BBVTVIE,
+    EinsUndEinsTVIE,
+    EWETVIE,
+    GlattvisionTVIE,
+    MNetTVIE,
+    MyVisionTVIE,
+    NetPlusIE,
+    OsnatelTVIE,
+    QuantumTVIE,
     QuicklineIE,
     QuicklineLiveIE,
+    SAKTVIE,
+    VTXTVIE,
+    WalyTVIE,
     ZattooIE,
     ZattooLiveIE,
 )

youtube_dl/extractor/generic.py

@@ -3023,7 +3023,7 @@ class GenericIE(InfoExtractor):
                 wapo_urls, video_id, video_title, ie=WashingtonPostIE.ie_key())
 
         # Look for Mediaset embeds
-        mediaset_urls = MediasetIE._extract_urls(webpage)
+        mediaset_urls = MediasetIE._extract_urls(self, webpage)
         if mediaset_urls:
             return self.playlist_from_matches(
                 mediaset_urls, video_id, video_title, ie=MediasetIE.ie_key())

youtube_dl/extractor/hotstar.py

@@ -1,49 +1,55 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
-import re
+import hashlib
+import hmac
+import time
 
 from .common import InfoExtractor
-from ..compat import compat_str
+from ..compat import compat_HTTPError
 from ..utils import (
     determine_ext,
     ExtractorError,
     int_or_none,
+    try_get,
 )
 
 
 class HotStarBaseIE(InfoExtractor):
-    _GEO_COUNTRIES = ['IN']
-
-    def _download_json(self, *args, **kwargs):
-        response = super(HotStarBaseIE, self)._download_json(*args, **kwargs)
-        if response['resultCode'] != 'OK':
-            if kwargs.get('fatal'):
-                raise ExtractorError(
-                    response['errorDescription'], expected=True)
-            return None
-        return response['resultObj']
-
-    def _download_content_info(self, content_id):
-        return self._download_json(
-            'https://account.hotstar.com/AVS/besc', content_id, query={
-                'action': 'GetAggregatedContentDetails',
-                'appVersion': '5.0.40',
-                'channel': 'PCTV',
-                'contentId': content_id,
-            })['contentInfo'][0]
+    _AKAMAI_ENCRYPTION_KEY = b'\x05\xfc\x1a\x01\xca\xc9\x4b\xc4\x12\xfc\x53\x12\x07\x75\xf9\xee'
+
+    def _call_api(self, path, video_id, query_name='contentId'):
+        st = int(time.time())
+        exp = st + 6000
+        auth = 'st=%d~exp=%d~acl=/*' % (st, exp)
+        auth += '~hmac=' + hmac.new(self._AKAMAI_ENCRYPTION_KEY, auth.encode(), hashlib.sha256).hexdigest()
+        response = self._download_json(
+            'https://api.hotstar.com/' + path,
+            video_id, headers={
+                'hotstarauth': auth,
+                'x-country-code': 'IN',
+                'x-platform-code': 'JIO',
+            }, query={
+                query_name: video_id,
+                'tas': 10000,
+            })
+        if response['statusCode'] != 'OK':
+            raise ExtractorError(
+                response['body']['message'], expected=True)
+        return response['body']['results']
 
 
 class HotStarIE(HotStarBaseIE):
+    IE_NAME = 'hotstar'
     _VALID_URL = r'https?://(?:www\.)?hotstar\.com/(?:.+?[/-])?(?P<id>\d{10})'
     _TESTS = [{
-        'url': 'http://www.hotstar.com/on-air-with-aib--english-1000076273',
+        'url': 'https://www.hotstar.com/can-you-not-spread-rumours/1000076273',
         'info_dict': {
             'id': '1000076273',
             'ext': 'mp4',
-            'title': 'On Air With AIB',
+            'title': 'Can You Not Spread Rumours?',
             'description': 'md5:c957d8868e9bc793ccb813691cc4c434',
-            'timestamp': 1447227000,
+            'timestamp': 1447248600,
             'upload_date': '20151111',
             'duration': 381,
         },
@@ -58,47 +64,47 @@ class HotStarIE(HotStarBaseIE):
         'url': 'http://www.hotstar.com/1000000515',
         'only_matching': True,
     }]
+    _GEO_BYPASS = False
 
     def _real_extract(self, url):
         video_id = self._match_id(url)
 
-        video_data = self._download_content_info(video_id)
+        webpage = self._download_webpage(url, video_id)
+        app_state = self._parse_json(self._search_regex(
+            r'<script>window\.APP_STATE\s*=\s*({.+?})</script>',
+            webpage, 'app state'), video_id)
+        video_data = {}
+        for v in app_state.values():
+            content = try_get(v, lambda x: x['initialState']['contentData']['content'], dict)
+            if content and content.get('contentId') == video_id:
+                video_data = content
 
-        title = video_data['episodeTitle']
+        title = video_data['title']
 
-        if video_data.get('encrypted') == 'Y':
+        if video_data.get('drmProtected'):
             raise ExtractorError('This video is DRM protected.', expected=True)
 
         formats = []
-        for f in ('JIO',):
-            format_data = self._download_json(
-                'http://getcdn.hotstar.com/AVS/besc',
-                video_id, 'Downloading %s JSON metadata' % f,
-                fatal=False, query={
-                    'action': 'GetCDN',
-                    'asJson': 'Y',
-                    'channel': f,
-                    'id': video_id,
-                    'type': 'VOD',
-                })
-            if format_data:
-                format_url = format_data.get('src')
-                if not format_url:
-                    continue
-                ext = determine_ext(format_url)
-                if ext == 'm3u8':
-                    formats.extend(self._extract_m3u8_formats(
-                        format_url, video_id, 'mp4',
-                        m3u8_id='hls', fatal=False))
-                elif ext == 'f4m':
-                    # produce broken files
-                    continue
-                else:
-                    formats.append({
-                        'url': format_url,
-                        'width': int_or_none(format_data.get('width')),
-                        'height': int_or_none(format_data.get('height')),
-                    })
+        format_data = self._call_api('h/v1/play', video_id)['item']
+        format_url = format_data['playbackUrl']
+        ext = determine_ext(format_url)
+        if ext == 'm3u8':
+            try:
+                formats.extend(self._extract_m3u8_formats(
+                    format_url, video_id, 'mp4', m3u8_id='hls'))
+            except ExtractorError as e:
+                if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
+                    self.raise_geo_restricted(countries=['IN'])
+                raise
+        elif ext == 'f4m':
+            # produce broken files
+            pass
+        else:
+            formats.append({
+                'url': format_url,
+                'width': int_or_none(format_data.get('width')),
+                'height': int_or_none(format_data.get('height')),
+            })
         self._sort_formats(formats)
 
         return {
@@ -106,57 +112,43 @@ class HotStarIE(HotStarBaseIE):
             'title': title,
             'description': video_data.get('description'),
             'duration': int_or_none(video_data.get('duration')),
-            'timestamp': int_or_none(video_data.get('broadcastDate')),
+            'timestamp': int_or_none(video_data.get('broadcastDate') or video_data.get('startDate')),
             'formats': formats,
+            'channel': video_data.get('channelName'),
+            'channel_id': video_data.get('channelId'),
+            'series': video_data.get('showName'),
+            'season': video_data.get('seasonName'),
+            'season_number': int_or_none(video_data.get('seasonNo')),
+            'season_id': video_data.get('seasonId'),
             'episode': title,
-            'episode_number': int_or_none(video_data.get('episodeNumber')),
-            'series': video_data.get('contentTitle'),
+            'episode_number': int_or_none(video_data.get('episodeNo')),
         }
 
 
 class HotStarPlaylistIE(HotStarBaseIE):
     IE_NAME = 'hotstar:playlist'
-    _VALID_URL = r'(?P<url>https?://(?:www\.)?hotstar\.com/tv/[^/]+/(?P<content_id>\d+))/(?P<type>[^/]+)/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?hotstar\.com/tv/[^/]+/s-\w+/list/[^/]+/t-(?P<id>\w+)'
     _TESTS = [{
-        'url': 'http://www.hotstar.com/tv/pratidaan/14982/episodes/14812/9993',
+        'url': 'https://www.hotstar.com/tv/savdhaan-india/s-26/list/popular-clips/t-3_2_26',
         'info_dict': {
-            'id': '14812',
+            'id': '3_2_26',
         },
-        'playlist_mincount': 75,
+        'playlist_mincount': 20,
     }, {
-        'url': 'http://www.hotstar.com/tv/pratidaan/14982/popular-clips/9998/9998',
+        'url': 'https://www.hotstar.com/tv/savdhaan-india/s-26/list/extras/t-2480',
         'only_matching': True,
     }]
-    _ITEM_TYPES = {
-        'episodes': 'EPISODE',
-        'popular-clips': 'CLIPS',
-    }
 
     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        base_url = mobj.group('url')
-        content_id = mobj.group('content_id')
-        playlist_type = mobj.group('type')
+        playlist_id = self._match_id(url)
 
-        content_info = self._download_content_info(content_id)
-        playlist_id = compat_str(content_info['categoryId'])
-
-        collection = self._download_json(
-            'https://search.hotstar.com/AVS/besc', playlist_id, query={
-                'action': 'SearchContents',
-                'appVersion': '5.0.40',
-                'channel': 'PCTV',
-                'moreFilters': 'series:%s;' % playlist_id,
-                'query': '*',
-                'searchOrder': 'last_broadcast_date desc,year desc,title asc',
-                'type': self._ITEM_TYPES.get(playlist_type, 'EPISODE'),
-            })
+        collection = self._call_api('o/v1/tray/find', playlist_id, 'uqId')
 
         entries = [
             self.url_result(
-                '%s/_/%s' % (base_url, video['contentId']),
+                'https://www.hotstar.com/%s' % video['contentId'],
                 ie=HotStarIE.ie_key(), video_id=video['contentId'])
-            for video in collection['response']['docs']
+            for video in collection['assets']['items']
             if video.get('contentId')]
 
         return self.playlist_result(entries, playlist_id)
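
A standalone sketch of the Akamai-style hotstarauth token that _call_api() builds above; the key below is a dummy placeholder, while the real extractor uses its _AKAMAI_ENCRYPTION_KEY constant:

import hashlib
import hmac
import time

DUMMY_KEY = b'\x00' * 16  # placeholder, not the real key

def build_hotstarauth(key=DUMMY_KEY, validity=6000):
    # Token carries start time, expiry and ACL, then an HMAC-SHA256 of
    # that same string keyed with the shared secret.
    st = int(time.time())
    auth = 'st=%d~exp=%d~acl=/*' % (st, st + validity)
    return auth + '~hmac=' + hmac.new(key, auth.encode(), hashlib.sha256).hexdigest()

print(build_hotstarauth())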

youtube_dl/extractor/jamendo.py

@@ -26,8 +26,15 @@ class JamendoBaseIE(InfoExtractor):
 
 
 class JamendoIE(JamendoBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?jamendo\.com/track/(?P<id>[0-9]+)/(?P<display_id>[^/?#&]+)'
-    _TEST = {
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:
+                            licensing\.jamendo\.com/[^/]+|
+                            (?:www\.)?jamendo\.com
+                        )
+                        /track/(?P<id>[0-9]+)/(?P<display_id>[^/?#&]+)
+                    '''
+    _TESTS = [{
         'url': 'https://www.jamendo.com/track/196219/stories-from-emona-i',
         'md5': '6e9e82ed6db98678f171c25a8ed09ffd',
         'info_dict': {
@@ -40,14 +47,19 @@ class JamendoIE(JamendoBaseIE):
             'duration': 210,
             'thumbnail': r're:^https?://.*\.jpg'
         }
-    }
+    }, {
+        'url': 'https://licensing.jamendo.com/en/track/1496667/energetic-rock',
+        'only_matching': True,
+    }]
 
     def _real_extract(self, url):
         mobj = self._VALID_URL_RE.match(url)
         track_id = mobj.group('id')
         display_id = mobj.group('display_id')
-        webpage = self._download_webpage(url, display_id)
+        webpage = self._download_webpage(
+            'https://www.jamendo.com/track/%s/%s' % (track_id, display_id),
+            display_id)
 
         title, artist, track = self._extract_meta(webpage)

youtube_dl/extractor/mediaset.py

@@ -4,6 +4,11 @@ from __future__ import unicode_literals
 import re
 
 from .theplatform import ThePlatformBaseIE
+from ..compat import (
+    compat_parse_qs,
+    compat_str,
+    compat_urllib_parse_urlparse,
+)
 from ..utils import (
     ExtractorError,
     int_or_none,
@@ -76,12 +81,33 @@ class MediasetIE(ThePlatformBaseIE):
     }]
 
     @staticmethod
-    def _extract_urls(webpage):
-        return [
-            mobj.group('url')
-            for mobj in re.finditer(
-                r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>https?://(?:www\.)?video\.mediaset\.it/player/playerIFrame(?:Twitter)?\.shtml\?.*?\bid=\d+.*?)\1',
-                webpage)]
+    def _extract_urls(ie, webpage):
+        def _qs(url):
+            return compat_parse_qs(compat_urllib_parse_urlparse(url).query)
+
+        def _program_guid(qs):
+            return qs.get('programGuid', [None])[0]
+
+        entries = []
+        for mobj in re.finditer(
+                r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//(?:www\.)?video\.mediaset\.it/player/playerIFrame(?:Twitter)?\.shtml.*?)\1',
+                webpage):
+            embed_url = mobj.group('url')
+            embed_qs = _qs(embed_url)
+            program_guid = _program_guid(embed_qs)
+            if program_guid:
+                entries.append(embed_url)
+                continue
+            video_id = embed_qs.get('id', [None])[0]
+            if not video_id:
+                continue
+            urlh = ie._request_webpage(
+                embed_url, video_id, note='Following embed URL redirect')
+            embed_url = compat_str(urlh.geturl())
+            program_guid = _program_guid(_qs(embed_url))
+            if program_guid:
+                entries.append(embed_url)
+        return entries
 
     def _real_extract(self, url):
         guid = self._match_id(url)
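
A small sketch of the querystring handling the new _extract_urls() performs, using the standard library instead of the compat wrappers; the embed URL is fabricated for illustration:

from urllib.parse import parse_qs, urlparse

def classify_mediaset_embed(embed_url):
    # Split the iframe src into query parameters, prefer programGuid and
    # fall back to the numeric id if it is missing.
    qs = parse_qs(urlparse(embed_url).query)
    program_guid = qs.get('programGuid', [None])[0]
    video_id = qs.get('id', [None])[0]
    return program_guid or video_id

print(classify_mediaset_embed(
    'https://www.video.mediaset.it/player/playerIFrame.shtml?id=665924&autoplay=true'))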

youtube_dl/extractor/openload.py

@@ -243,7 +243,7 @@ class PhantomJSwrapper(object):
 
 class OpenloadIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?(?:openload\.(?:co|io|link)|oload\.(?:tv|stream|site|xyz|win|download))/(?:f|embed)/(?P<id>[a-zA-Z0-9-_]+)'
+    _VALID_URL = r'https?://(?:www\.)?(?:openload\.(?:co|io|link)|oload\.(?:tv|stream|site|xyz|win|download|cloud|cc))/(?:f|embed)/(?P<id>[a-zA-Z0-9-_]+)'
 
     _TESTS = [{
         'url': 'https://openload.co/f/kUEfGclsU9o',
@@ -307,10 +307,16 @@ class OpenloadIE(InfoExtractor):
     }, {
         'url': 'https://oload.download/f/kUEfGclsU9o',
         'only_matching': True,
+    }, {
+        'url': 'https://oload.cloud/f/4ZDnBXRWiB8',
+        'only_matching': True,
     }, {
         # Its title has not got its extension but url has it
         'url': 'https://oload.download/f/N4Otkw39VCw/Tomb.Raider.2018.HDRip.XviD.AC3-EVO.avi.mp4',
         'only_matching': True,
+    }, {
+        'url': 'https://oload.cc/embed/5NEAbI2BDSk',
+        'only_matching': True,
     }]
 
     _USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'
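
A quick standalone check (illustrative only, pattern repeated so the snippet runs on its own) that the extended _VALID_URL above accepts the newly added oload.cloud and oload.cc hosts:

import re

VALID_URL = r'https?://(?:www\.)?(?:openload\.(?:co|io|link)|oload\.(?:tv|stream|site|xyz|win|download|cloud|cc))/(?:f|embed)/(?P<id>[a-zA-Z0-9-_]+)'

for url in (
    'https://oload.cloud/f/4ZDnBXRWiB8',   # newly covered
    'https://oload.cc/embed/5NEAbI2BDSk',  # newly covered
    'https://openload.co/f/kUEfGclsU9o',   # already supported
):
    m = re.match(VALID_URL, url)
    print(url, '->', m.group('id') if m else 'no match')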

youtube_dl/extractor/patreon.py

@@ -2,52 +2,63 @@
 from __future__ import unicode_literals
 
 from .common import InfoExtractor
-from ..utils import js_to_json
+from ..utils import (
+    clean_html,
+    determine_ext,
+    int_or_none,
+    parse_iso8601,
+)
 
 
 class PatreonIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?patreon\.com/creation\?hid=(?P<id>[^&#]+)'
-    _TESTS = [
-        {
-            'url': 'http://www.patreon.com/creation?hid=743933',
-            'md5': 'e25505eec1053a6e6813b8ed369875cc',
-            'info_dict': {
-                'id': '743933',
-                'ext': 'mp3',
-                'title': 'Episode 166: David Smalley of Dogma Debate',
-                'uploader': 'Cognitive Dissonance Podcast',
-                'thumbnail': 're:^https?://.*$',
-            },
-        },
-        {
-            'url': 'http://www.patreon.com/creation?hid=754133',
-            'md5': '3eb09345bf44bf60451b8b0b81759d0a',
-            'info_dict': {
-                'id': '754133',
-                'ext': 'mp3',
-                'title': 'CD 167 Extra',
-                'uploader': 'Cognitive Dissonance Podcast',
-                'thumbnail': 're:^https?://.*$',
-            },
-        },
-        {
-            'url': 'https://www.patreon.com/creation?hid=1682498',
-            'info_dict': {
-                'id': 'SU4fj_aEMVw',
-                'ext': 'mp4',
-                'title': 'I\'m on Patreon!',
-                'uploader': 'TraciJHines',
-                'thumbnail': 're:^https?://.*$',
-                'upload_date': '20150211',
-                'description': 'md5:c5a706b1f687817a3de09db1eb93acd4',
-                'uploader_id': 'TraciJHines',
-            },
-            'params': {
-                'noplaylist': True,
-                'skip_download': True,
-            }
-        }
-    ]
+    _VALID_URL = r'https?://(?:www\.)?patreon\.com/(?:creation\?hid=|posts/(?:[\w-]+-)?)(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'http://www.patreon.com/creation?hid=743933',
+        'md5': 'e25505eec1053a6e6813b8ed369875cc',
+        'info_dict': {
+            'id': '743933',
+            'ext': 'mp3',
+            'title': 'Episode 166: David Smalley of Dogma Debate',
+            'description': 'md5:713b08b772cd6271b9f3906683cfacdf',
+            'uploader': 'Cognitive Dissonance Podcast',
+            'thumbnail': 're:^https?://.*$',
+            'timestamp': 1406473987,
+            'upload_date': '20140727',
+        },
+    }, {
+        'url': 'http://www.patreon.com/creation?hid=754133',
+        'md5': '3eb09345bf44bf60451b8b0b81759d0a',
+        'info_dict': {
+            'id': '754133',
+            'ext': 'mp3',
+            'title': 'CD 167 Extra',
+            'uploader': 'Cognitive Dissonance Podcast',
+            'thumbnail': 're:^https?://.*$',
+        },
+        'skip': 'Patron-only content',
+    }, {
+        'url': 'https://www.patreon.com/creation?hid=1682498',
+        'info_dict': {
+            'id': 'SU4fj_aEMVw',
+            'ext': 'mp4',
+            'title': 'I\'m on Patreon!',
+            'uploader': 'TraciJHines',
+            'thumbnail': 're:^https?://.*$',
+            'upload_date': '20150211',
+            'description': 'md5:c5a706b1f687817a3de09db1eb93acd4',
+            'uploader_id': 'TraciJHines',
+        },
+        'params': {
+            'noplaylist': True,
+            'skip_download': True,
+        }
+    }, {
+        'url': 'https://www.patreon.com/posts/episode-166-of-743933',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.patreon.com/posts/743933',
+        'only_matching': True,
+    }]
 
     # Currently Patreon exposes download URL via hidden CSS, so login is not
     # needed. Keeping this commented for when this inevitably changes.
@@ -78,38 +89,48 @@ class PatreonIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
-        title = self._og_search_title(webpage).strip()
-
-        attach_fn = self._html_search_regex(
-            r'<div class="attach"><a target="_blank" href="([^"]+)">',
-            webpage, 'attachment URL', default=None)
-        embed = self._html_search_regex(
-            r'<div[^>]+id="watchCreation"[^>]*>\s*<iframe[^>]+src="([^"]+)"',
-            webpage, 'embedded URL', default=None)
-
-        if attach_fn is not None:
-            video_url = 'http://www.patreon.com' + attach_fn
-            thumbnail = self._og_search_thumbnail(webpage)
-            uploader = self._html_search_regex(
-                r'<strong>(.*?)</strong> is creating', webpage, 'uploader')
-        elif embed is not None:
-            return self.url_result(embed)
-        else:
-            playlist = self._parse_json(self._search_regex(
-                r'(?s)new\s+jPlayerPlaylist\(\s*\{\s*[^}]*},\s*(\[.*?,?\s*\])',
-                webpage, 'playlist JSON'),
-                video_id, transform_source=js_to_json)
-            data = playlist[0]
-            video_url = self._proto_relative_url(data['mp3'])
-            thumbnail = self._proto_relative_url(data.get('cover'))
-            uploader = data.get('artist')
-
-        return {
+        post = self._download_json(
+            'https://www.patreon.com/api/posts/' + video_id, video_id)
+        attributes = post['data']['attributes']
+        title = attributes['title'].strip()
+        image = attributes.get('image') or {}
+        info = {
             'id': video_id,
-            'url': video_url,
-            'ext': 'mp3',
             'title': title,
-            'uploader': uploader,
-            'thumbnail': thumbnail,
+            'description': clean_html(attributes.get('content')),
+            'thumbnail': image.get('large_url') or image.get('url'),
+            'timestamp': parse_iso8601(attributes.get('published_at')),
+            'like_count': int_or_none(attributes.get('like_count')),
+            'comment_count': int_or_none(attributes.get('comment_count')),
         }
+
+        def add_file(file_data):
+            file_url = file_data.get('url')
+            if file_url:
+                info.update({
+                    'url': file_url,
+                    'ext': determine_ext(file_data.get('name'), 'mp3'),
+                })
+
+        for i in post.get('included', []):
+            i_type = i.get('type')
+            if i_type == 'attachment':
+                add_file(i.get('attributes') or {})
+            elif i_type == 'user':
+                user_attributes = i.get('attributes')
+                if user_attributes:
+                    info.update({
+                        'uploader': user_attributes.get('full_name'),
+                        'uploader_url': user_attributes.get('url'),
+                    })
+
+        if not info.get('url'):
+            add_file(attributes.get('post_file') or {})
+
+        if not info.get('url'):
+            info.update({
+                '_type': 'url',
+                'url': attributes['embed']['url'],
+            })
+
+        return info
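
A compressed sketch of walking the JSON:API-style post payload the way the rewritten extractor does; the payload below is fabricated for illustration:

sample_post = {
    'data': {'attributes': {'title': 'Episode 166 '}},
    'included': [
        {'type': 'user', 'attributes': {'full_name': 'Some Creator'}},
        {'type': 'attachment', 'attributes': {'url': 'https://example.com/episode166.mp3', 'name': 'episode166.mp3'}},
    ],
}

info = {'title': sample_post['data']['attributes']['title'].strip()}
for item in sample_post.get('included', []):
    attrs = item.get('attributes') or {}
    if item.get('type') == 'attachment':
        # the attachment carries the direct media URL
        info['url'] = attrs.get('url')
    elif item.get('type') == 'user':
        info['uploader'] = attrs.get('full_name')
print(info)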

youtube_dl/extractor/philharmoniedeparis.py

@@ -2,31 +2,38 @@
 from __future__ import unicode_literals
 
 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
-    float_or_none,
-    int_or_none,
-    parse_iso8601,
-    xpath_text,
+    try_get,
+    urljoin,
 )
 
 
 class PhilharmonieDeParisIE(InfoExtractor):
     IE_DESC = 'Philharmonie de Paris'
-    _VALID_URL = r'https?://live\.philharmoniedeparis\.fr/(?:[Cc]oncert/|misc/Playlist\.ashx\?id=)(?P<id>\d+)'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:
+                            live\.philharmoniedeparis\.fr/(?:[Cc]oncert/|misc/Playlist\.ashx\?id=)|
+                            pad\.philharmoniedeparis\.fr/doc/CIMU/
+                        )
+                        (?P<id>\d+)
+                    '''
     _TESTS = [{
+        'url': 'http://pad.philharmoniedeparis.fr/doc/CIMU/1086697/jazz-a-la-villette-knower',
+        'md5': 'a0a4b195f544645073631cbec166a2c2',
+        'info_dict': {
+            'id': '1086697',
+            'ext': 'mp4',
+            'title': 'Jazz à la Villette : Knower',
+        },
+    }, {
         'url': 'http://live.philharmoniedeparis.fr/concert/1032066.html',
         'info_dict': {
             'id': '1032066',
-            'ext': 'flv',
-            'title': 'md5:d1f5585d87d041d07ce9434804bc8425',
-            'timestamp': 1428179400,
-            'upload_date': '20150404',
-            'duration': 6592.278,
+            'title': 'md5:0a031b81807b3593cffa3c9a87a167a0',
         },
-        'params': {
-            # rtmp download
-            'skip_download': True,
-        }
+        'playlist_mincount': 2,
     }, {
         'url': 'http://live.philharmoniedeparis.fr/Concert/1030324.html',
         'only_matching': True,
@@ -34,45 +41,60 @@ class PhilharmonieDeParisIE(InfoExtractor):
         'url': 'http://live.philharmoniedeparis.fr/misc/Playlist.ashx?id=1030324&track=&lang=fr',
         'only_matching': True,
     }]
+    _LIVE_URL = 'https://live.philharmoniedeparis.fr'
 
     def _real_extract(self, url):
         video_id = self._match_id(url)
 
-        concert = self._download_xml(
-            'http://live.philharmoniedeparis.fr/misc/Playlist.ashx?id=%s' % video_id,
-            video_id).find('./concert')
+        config = self._download_json(
+            '%s/otoPlayer/config.ashx' % self._LIVE_URL, video_id, query={
+                'id': video_id,
+                'lang': 'fr-FR',
+            })
 
-        formats = []
-        info_dict = {
-            'id': video_id,
-            'title': xpath_text(concert, './titre', 'title', fatal=True),
-            'formats': formats,
-        }
-
-        fichiers = concert.find('./fichiers')
-        stream = fichiers.attrib['serveurstream']
-        for fichier in fichiers.findall('./fichier'):
-            info_dict['duration'] = float_or_none(fichier.get('timecodefin'))
-            for quality, (format_id, suffix) in enumerate([('lq', ''), ('hq', '_hd')]):
-                format_url = fichier.get('url%s' % suffix)
-                if not format_url:
-                    continue
-                formats.append({
-                    'url': stream,
-                    'play_path': format_url,
-                    'ext': 'flv',
-                    'format_id': format_id,
-                    'width': int_or_none(concert.get('largeur%s' % suffix)),
-                    'height': int_or_none(concert.get('hauteur%s' % suffix)),
-                    'quality': quality,
-                })
-        self._sort_formats(formats)
+        def extract_entry(source):
+            if not isinstance(source, dict):
+                return
+            title = source.get('title')
+            if not title:
+                return
+            files = source.get('files')
+            if not isinstance(files, dict):
+                return
+            format_urls = set()
+            formats = []
+            for format_id in ('mobile', 'desktop'):
+                format_url = try_get(
+                    files, lambda x: x[format_id]['file'], compat_str)
+                if not format_url or format_url in format_urls:
+                    continue
+                format_urls.add(format_url)
+                m3u8_url = urljoin(self._LIVE_URL, format_url)
+                formats.extend(self._extract_m3u8_formats(
+                    m3u8_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                    m3u8_id='hls', fatal=False))
+            if not formats:
+                return
+            self._sort_formats(formats)
+            return {
+                'title': title,
+                'formats': formats,
+            }
 
-        date, hour = concert.get('date'), concert.get('heure')
-        if date and hour:
-            info_dict['timestamp'] = parse_iso8601(
-                '%s-%s-%sT%s:00' % (date[0:4], date[4:6], date[6:8], hour))
-        elif date:
-            info_dict['upload_date'] = date
+        thumbnail = urljoin(self._LIVE_URL, config.get('image'))
 
-        return info_dict
+        info = extract_entry(config)
+        if info:
+            info.update({
+                'id': video_id,
+                'thumbnail': thumbnail,
+            })
+            return info
+
+        entries = []
+        for num, chapter in enumerate(config['chapters'], start=1):
+            entry = extract_entry(chapter)
+            entry['id'] = '%s-%d' % (video_id, num)
+            entries.append(entry)
+
+        return self.playlist_result(entries, video_id, config.get('title'))
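
A rough sketch of the playlist branch above: iterate config['chapters'] and turn each chapter's HLS file into an entry. The config dict and ids are fabricated for illustration:

config = {
    'title': 'Concert title',
    'chapters': [
        {'title': 'Part 1', 'files': {'desktop': {'file': '/misc/video1.m3u8'}}},
        {'title': 'Part 2', 'files': {'desktop': {'file': '/misc/video2.m3u8'}}},
    ],
}

entries = []
for num, chapter in enumerate(config['chapters'], start=1):
    files = chapter.get('files') or {}
    m3u8_path = (files.get('desktop') or files.get('mobile') or {}).get('file')
    if not m3u8_path:
        continue
    entries.append({
        'id': '1032066-%d' % num,  # example video id
        'title': chapter['title'],
        'url': 'https://live.philharmoniedeparis.fr' + m3u8_path,
    })
print(entries)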

youtube_dl/extractor/pluralsight.py

@@ -4,6 +4,7 @@ import collections
 import json
 import os
 import random
+import re
 
 from .common import InfoExtractor
 from ..compat import (
@@ -196,7 +197,10 @@ query viewClip {
         if error:
             raise ExtractorError('Unable to login: %s' % error, expected=True)
 
-        if all(p not in response for p in ('__INITIAL_STATE__', '"currentUser"')):
+        if all(not re.search(p, response) for p in (
+                r'__INITIAL_STATE__', r'["\']currentUser["\']',
+                # new layout?
+                r'>\s*Sign out\s*<')):
             BLOCKED = 'Your account has been blocked due to suspicious activity'
             if BLOCKED in response:
                 raise ExtractorError(
@@ -210,18 +214,26 @@ query viewClip {
         raise ExtractorError('Unable to log in')
 
-    def _get_subtitles(self, author, clip_idx, lang, name, duration, video_id):
-        captions_post = {
-            'a': author,
-            'cn': clip_idx,
-            'lc': lang,
-            'm': name,
-        }
-        captions = self._download_json(
-            '%s/player/retrieve-captions' % self._API_BASE, video_id,
-            'Downloading captions JSON', 'Unable to download captions JSON',
-            fatal=False, data=json.dumps(captions_post).encode('utf-8'),
-            headers={'Content-Type': 'application/json;charset=utf-8'})
+    def _get_subtitles(self, author, clip_idx, clip_id, lang, name, duration, video_id):
+        captions = None
+        if clip_id:
+            captions = self._download_json(
+                '%s/transcript/api/v1/caption/json/%s/%s'
+                % (self._API_BASE, clip_id, lang), video_id,
+                'Downloading captions JSON', 'Unable to download captions JSON',
+                fatal=False)
+        if not captions:
+            captions_post = {
+                'a': author,
+                'cn': int(clip_idx),
+                'lc': lang,
+                'm': name,
+            }
+            captions = self._download_json(
+                '%s/player/retrieve-captions' % self._API_BASE, video_id,
+                'Downloading captions JSON', 'Unable to download captions JSON',
+                fatal=False, data=json.dumps(captions_post).encode('utf-8'),
+                headers={'Content-Type': 'application/json;charset=utf-8'})
         if captions:
             return {
                 lang: [{
@@ -413,7 +425,7 @@ query viewClip {
 
         # TODO: other languages?
         subtitles = self.extract_subtitles(
-            author, clip_idx, 'en', name, duration, display_id)
+            author, clip_idx, clip.get('clipId'), 'en', name, duration, display_id)
 
         return {
             'id': clip_id,
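
A standalone illustration of the relaxed logged-in check introduced above: the page counts as logged in if any one of several markers matches:

import re

LOGGED_IN_MARKERS = (
    r'__INITIAL_STATE__',
    r'["\']currentUser["\']',
    r'>\s*Sign out\s*<',  # newer page layout
)

def looks_logged_out(html):
    # True only when none of the markers is found in the response body.
    return all(not re.search(p, html) for p in LOGGED_IN_MARKERS)

print(looks_logged_out('<a href="/logout"> Sign out </a>'))          # False
print(looks_logged_out('<html><body>Please log in</body></html>'))   # True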

youtube_dl/extractor/rutube.py

@@ -103,7 +103,8 @@ class RutubeIE(RutubeBaseIE):
 
         options = self._download_json(
             'http://rutube.ru/api/play/options/%s/?format=json' % video_id,
-            video_id, 'Downloading options JSON')
+            video_id, 'Downloading options JSON',
+            headers=self.geo_verification_headers())
 
         formats = []
         for format_id, format_url in options['video_balancer'].items():

youtube_dl/extractor/spike.py

@@ -44,3 +44,10 @@ class ParamountNetworkIE(MTVServicesInfoExtractor):
 
     _FEED_URL = 'http://www.paramountnetwork.com/feeds/mrss/'
     _GEO_COUNTRIES = ['US']
+
+    def _extract_mgid(self, webpage):
+        cs = self._parse_json(self._search_regex(
+            r'window\.__DATA__\s*=\s*({.+})',
+            webpage, 'data'), None)['children']
+        c = next(c for c in cs if c.get('type') == 'VideoPlayer')
+        return c['props']['media']['video']['config']['uri']
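
A standalone sketch of the new _extract_mgid() logic: pull the window.__DATA__ JSON out of the page and walk it to the VideoPlayer child's mgid URI. The HTML snippet is fabricated for illustration:

import json
import re

html = ('<script>window.__DATA__ = {"children": [{"type": "Header"}, '
        '{"type": "VideoPlayer", "props": {"media": {"video": {"config": '
        '{"uri": "mgid:arc:video:paramountnetwork.com:example-id"}}}}}]}</script>')

# Same greedy regex as above; on real pages the JSON sits on one line.
data = json.loads(re.search(r'window\.__DATA__\s*=\s*({.+})', html).group(1))
player = next(c for c in data['children'] if c.get('type') == 'VideoPlayer')
print(player['props']['media']['video']['config']['uri'])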

youtube_dl/extractor/ted.py

@@ -212,8 +212,6 @@ class TEDIE(InfoExtractor):
 
         http_url = None
         for format_id, resources in resources_.items():
-            if not isinstance(resources, dict):
-                continue
             if format_id == 'h264':
                 for resource in resources:
                     h264_url = resource.get('file')
@@ -242,6 +240,8 @@
                     'tbr': int_or_none(resource.get('bitrate')),
                 })
             elif format_id == 'hls':
+                if not isinstance(resources, dict):
+                    continue
                 stream_url = url_or_none(resources.get('stream'))
                 if not stream_url:
                     continue

youtube_dl/extractor/tv3.py (deleted)

@@ -1,34 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-
-
-class TV3IE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?tv3\.co\.nz/(?P<id>[^/]+)/tabid/\d+/articleID/\d+/MCat/\d+/Default\.aspx'
-    _TEST = {
-        'url': 'http://www.tv3.co.nz/MOTORSPORT-SRS-SsangYong-Hampton-Downs-Round-3/tabid/3692/articleID/121615/MCat/2915/Default.aspx',
-        'info_dict': {
-            'id': '4659127992001',
-            'ext': 'mp4',
-            'title': 'CRC Motorsport: SRS SsangYong Hampton Downs Round 3 - S2015 Ep3',
-            'description': 'SsangYong Racing Series returns for Round 3 with drivers from New Zealand and Australia taking to the grid at Hampton Downs raceway.',
-            'uploader_id': '3812193411001',
-            'upload_date': '20151213',
-            'timestamp': 1449975272,
-        },
-        'expected_warnings': [
-            'Failed to download MPD manifest'
-        ],
-        'params': {
-            # m3u8 download
-            'skip_download': True,
-        },
-    }
-    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/3812193411001/default_default/index.html?videoId=%s'
-
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-        webpage = self._download_webpage(url, display_id)
-        brightcove_id = self._search_regex(r'<param\s*name="@videoPlayer"\s*value="(\d+)"', webpage, 'brightcove id')
-        return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)

youtube_dl/extractor/vimeo.py

@@ -551,6 +551,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
             else:
                 config_re = [r' = {config:({.+?}),assets:', r'(?:[abc])=({.+?});']
                 config_re.append(r'\bvar\s+r\s*=\s*({.+?})\s*;')
+                config_re.append(r'\bconfig\s*=\s*({.+?})\s*;')
             config = self._search_regex(config_re, webpage, 'info section',
                                         flags=re.DOTALL)
             config = json.loads(config)

youtube_dl/extractor/youtube.py

@@ -349,6 +349,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                             (?:www\.)?hooktube\.com/|
                             (?:www\.)?yourepeat\.com/|
                             tube\.majestyc\.net/|
+                            (?:www\.)?invidio\.us/|
                             youtube\.googleapis\.com/)                        # the various hostnames, with wildcard subdomains
                         (?:.*?\#/)?                                          # handle anchor (#/) redirect urls
                         (?:                                                  # the various things that can precede the ID:
@@ -1068,6 +1069,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
             'url': 'https://www.youtube.com/watch?v=MuAGGZNfUkU&list=RDMM',
             'only_matching': True,
         },
+        {
+            'url': 'https://invidio.us/watch?v=BaW_jenozKc',
+            'only_matching': True,
+        },
     ]
 
     def __init__(self, *args, **kwargs):
@@ -2419,7 +2424,7 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
 
 class YoutubeChannelIE(YoutubePlaylistBaseInfoExtractor):
     IE_DESC = 'YouTube.com channels'
-    _VALID_URL = r'https?://(?:youtu\.be|(?:\w+\.)?youtube(?:-nocookie)?\.com)/channel/(?P<id>[0-9A-Za-z_-]+)'
+    _VALID_URL = r'https?://(?:youtu\.be|(?:\w+\.)?youtube(?:-nocookie)?\.com|(?:www\.)?invidio\.us)/channel/(?P<id>[0-9A-Za-z_-]+)'
     _TEMPLATE_URL = 'https://www.youtube.com/channel/%s/videos'
     _VIDEO_RE = r'(?:title="(?P<title>[^"]+)"[^>]+)?href="/watch\?v=(?P<id>[0-9A-Za-z_-]+)&?'
     IE_NAME = 'youtube:channel'
@@ -2440,6 +2445,9 @@ class YoutubeChannelIE(YoutubePlaylistBaseInfoExtractor):
             'id': 'UUs0ifCMCm1icqRbqhUINa0w',
             'title': 'Uploads from Deus Ex',
         },
+    }, {
+        'url': 'https://invidio.us/channel/UC23qupoDRn9YOAVzeoxjOQA',
+        'only_matching': True,
     }]
 
     @classmethod

youtube_dl/extractor/zattoo.py

@@ -18,12 +18,12 @@ from ..utils import (
 )
 
 
-class ZattooBaseIE(InfoExtractor):
-    _NETRC_MACHINE = 'zattoo'
-    _HOST_URL = 'https://zattoo.com'
-
+class ZattooPlatformBaseIE(InfoExtractor):
     _power_guide_hash = None
 
+    def _host_url(self):
+        return 'https://%s' % self._HOST
+
     def _login(self):
         username, password = self._get_login_info()
         if not username or not password:
@@ -33,13 +33,13 @@ class ZattooBaseIE(InfoExtractor):
         try:
             data = self._download_json(
-                '%s/zapi/v2/account/login' % self._HOST_URL, None, 'Logging in',
+                '%s/zapi/v2/account/login' % self._host_url(), None, 'Logging in',
                 data=urlencode_postdata({
                     'login': username,
                     'password': password,
                     'remember': 'true',
                 }), headers={
-                    'Referer': '%s/login' % self._HOST_URL,
+                    'Referer': '%s/login' % self._host_url(),
                     'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
                 })
         except ExtractorError as e:
@@ -53,7 +53,7 @@ class ZattooBaseIE(InfoExtractor):
 
     def _real_initialize(self):
         webpage = self._download_webpage(
-            self._HOST_URL, None, 'Downloading app token')
+            self._host_url(), None, 'Downloading app token')
         app_token = self._html_search_regex(
             r'appToken\s*=\s*(["\'])(?P<token>(?:(?!\1).)+?)\1',
             webpage, 'app token', group='token')
@@ -62,7 +62,7 @@ class ZattooBaseIE(InfoExtractor):
 
         # Will setup appropriate cookies
         self._request_webpage(
-            '%s/zapi/v2/session/hello' % self._HOST_URL, None,
+            '%s/zapi/v2/session/hello' % self._host_url(), None,
             'Opening session', data=urlencode_postdata({
                 'client_app_token': app_token,
                 'uuid': compat_str(uuid4()),
@@ -75,7 +75,7 @@ class ZattooBaseIE(InfoExtractor):
 
     def _extract_cid(self, video_id, channel_name):
         channel_groups = self._download_json(
-            '%s/zapi/v2/cached/channels/%s' % (self._HOST_URL,
+            '%s/zapi/v2/cached/channels/%s' % (self._host_url(),
                                                self._power_guide_hash),
             video_id, 'Downloading channel list',
             query={'details': False})['channel_groups']
@@ -93,28 +93,30 @@ class ZattooBaseIE(InfoExtractor):
     def _extract_cid_and_video_info(self, video_id):
         data = self._download_json(
-            '%s/zapi/program/details' % self._HOST_URL,
+            '%s/zapi/v2/cached/program/power_details/%s' % (
+                self._host_url(), self._power_guide_hash),
             video_id,
             'Downloading video information',
             query={
-                'program_id': video_id,
-                'complete': True
+                'program_ids': video_id,
+                'complete': True,
             })
 
-        p = data['program']
+        p = data['programs'][0]
         cid = p['cid']
 
         info_dict = {
             'id': video_id,
-            'title': p.get('title') or p['episode_title'],
-            'description': p.get('description'),
-            'thumbnail': p.get('image_url'),
+            'title': p.get('t') or p['et'],
+            'description': p.get('d'),
+            'thumbnail': p.get('i_url'),
             'creator': p.get('channel_name'),
-            'episode': p.get('episode_title'),
-            'episode_number': int_or_none(p.get('episode_number')),
-            'season_number': int_or_none(p.get('season_number')),
+            'episode': p.get('et'),
+            'episode_number': int_or_none(p.get('e_no')),
+            'season_number': int_or_none(p.get('s_no')),
             'release_year': int_or_none(p.get('year')),
-            'categories': try_get(p, lambda x: x['categories'], list),
+            'categories': try_get(p, lambda x: x['c'], list),
+            'tags': try_get(p, lambda x: x['g'], list)
         }
 
         return cid, info_dict
@@ -126,11 +128,11 @@ class ZattooBaseIE(InfoExtractor):
         if is_live:
             postdata_common.update({'timeshift': 10800})
-            url = '%s/zapi/watch/live/%s' % (self._HOST_URL, cid)
+            url = '%s/zapi/watch/live/%s' % (self._host_url(), cid)
         elif record_id:
-            url = '%s/zapi/watch/recording/%s' % (self._HOST_URL, record_id)
+            url = '%s/zapi/watch/recording/%s' % (self._host_url(), record_id)
         else:
-            url = '%s/zapi/watch/recall/%s/%s' % (self._HOST_URL, cid, video_id)
+            url = '%s/zapi/watch/recall/%s/%s' % (self._host_url(), cid, video_id)
 
         formats = []
         for stream_type in ('dash', 'hls', 'hls5', 'hds'):
@@ -201,13 +203,13 @@ class ZattooBaseIE(InfoExtractor):
         return info_dict
 
 
-class QuicklineBaseIE(ZattooBaseIE):
+class QuicklineBaseIE(ZattooPlatformBaseIE):
     _NETRC_MACHINE = 'quickline'
-    _HOST_URL = 'https://mobiltv.quickline.com'
+    _HOST = 'mobiltv.quickline.com'
 
 
 class QuicklineIE(QuicklineBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?mobiltv\.quickline\.com/watch/(?P<channel>[^/]+)/(?P<id>[0-9]+)'
+    _VALID_URL = r'https?://(?:www\.)?%s/watch/(?P<channel>[^/]+)/(?P<id>[0-9]+)' % re.escape(QuicklineBaseIE._HOST)
 
     _TEST = {
         'url': 'https://mobiltv.quickline.com/watch/prosieben/130671867-maze-runner-die-auserwaehlten-in-der-brandwueste',
@@ -220,7 +222,7 @@ class QuicklineIE(QuicklineBaseIE):
 
 class QuicklineLiveIE(QuicklineBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?mobiltv\.quickline\.com/watch/(?P<id>[^/]+)'
+    _VALID_URL = r'https?://(?:www\.)?%s/watch/(?P<id>[^/]+)' % re.escape(QuicklineBaseIE._HOST)
 
     _TEST = {
         'url': 'https://mobiltv.quickline.com/watch/srf1',
@@ -236,8 +238,18 @@ class QuicklineLiveIE(QuicklineBaseIE):
         return self._extract_video(channel_name, video_id, is_live=True)
 
 
+class ZattooBaseIE(ZattooPlatformBaseIE):
+    _NETRC_MACHINE = 'zattoo'
+    _HOST = 'zattoo.com'
+
+
+def _make_valid_url(tmpl, host):
+    return tmpl % re.escape(host)
+
+
 class ZattooIE(ZattooBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?zattoo\.com/watch/(?P<channel>[^/]+?)/(?P<id>[0-9]+)[^/]+(?:/(?P<recid>[0-9]+))?'
+    _VALID_URL_TEMPLATE = r'https?://(?:www\.)?%s/watch/(?P<channel>[^/]+?)/(?P<id>[0-9]+)[^/]+(?:/(?P<recid>[0-9]+))?'
+    _VALID_URL = _make_valid_url(_VALID_URL_TEMPLATE, ZattooBaseIE._HOST)
 
     # Since regular videos are only available for 7 days and recorded videos
     # are only available for a specific user, we cannot have detailed tests.
@@ -269,3 +281,135 @@ class ZattooLiveIE(ZattooBaseIE):
     def _real_extract(self, url):
         channel_name = video_id = self._match_id(url)
         return self._extract_video(channel_name, video_id, is_live=True)
+
+
+class NetPlusIE(ZattooIE):
+    _NETRC_MACHINE = 'netplus'
+    _HOST = 'netplus.tv'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.netplus.tv/watch/abc/123-abc',
+        'only_matching': True,
+    }]
+
+
+class MNetTVIE(ZattooIE):
+    _NETRC_MACHINE = 'mnettv'
+    _HOST = 'tvplus.m-net.de'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.tvplus.m-net.de/watch/abc/123-abc',
+        'only_matching': True,
+    }]
+
+
+class WalyTVIE(ZattooIE):
+    _NETRC_MACHINE = 'walytv'
+    _HOST = 'player.waly.tv'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.player.waly.tv/watch/abc/123-abc',
+        'only_matching': True,
+    }]
+
+
+class BBVTVIE(ZattooIE):
+    _NETRC_MACHINE = 'bbvtv'
+    _HOST = 'bbv-tv.net'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.bbv-tv.net/watch/abc/123-abc',
+        'only_matching': True,
+    }]
+
+
+class VTXTVIE(ZattooIE):
+    _NETRC_MACHINE = 'vtxtv'
+    _HOST = 'vtxtv.ch'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.vtxtv.ch/watch/abc/123-abc',
+        'only_matching': True,
+    }]
+
+
+class MyVisionTVIE(ZattooIE):
+    _NETRC_MACHINE = 'myvisiontv'
+    _HOST = 'myvisiontv.ch'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.myvisiontv.ch/watch/abc/123-abc',
+        'only_matching': True,
+    }]
+
+
+class GlattvisionTVIE(ZattooIE):
+    _NETRC_MACHINE = 'glattvisiontv'
+    _HOST = 'iptv.glattvision.ch'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.iptv.glattvision.ch/watch/abc/123-abc',
+        'only_matching': True,
+    }]
+
+
+class SAKTVIE(ZattooIE):
+    _NETRC_MACHINE = 'saktv'
+    _HOST = 'saktv.ch'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.saktv.ch/watch/abc/123-abc',
+        'only_matching': True,
+    }]
+
+
+class EWETVIE(ZattooIE):
+    _NETRC_MACHINE = 'ewetv'
+    _HOST = 'tvonline.ewe.de'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.tvonline.ewe.de/watch/abc/123-abc',
+        'only_matching': True,
+    }]
+
+
+class QuantumTVIE(ZattooIE):
+    _NETRC_MACHINE = 'quantumtv'
+    _HOST = 'quantum-tv.com'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.quantum-tv.com/watch/abc/123-abc',
+        'only_matching': True,
+    }]
+
+
+class OsnatelTVIE(ZattooIE):
+    _NETRC_MACHINE = 'osnateltv'
+    _HOST = 'onlinetv.osnatel.de'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.onlinetv.osnatel.de/watch/abc/123-abc',
+        'only_matching': True,
+    }]
+
+
+class EinsUndEinsTVIE(ZattooIE):
+    _NETRC_MACHINE = '1und1tv'
+    _HOST = '1und1.tv'
+    _VALID_URL = _make_valid_url(ZattooIE._VALID_URL_TEMPLATE, _HOST)
+
+    _TESTS = [{
+        'url': 'https://www.1und1.tv/watch/abc/123-abc',
+        'only_matching': True,
+    }]
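
A standalone illustration of the host-templating trick that _make_valid_url() introduces above: one watch-URL pattern shared by every Zattoo-powered platform, specialised per host with re.escape:

import re

VALID_URL_TEMPLATE = r'https?://(?:www\.)?%s/watch/(?P<channel>[^/]+?)/(?P<id>[0-9]+)[^/]+(?:/(?P<recid>[0-9]+))?'

def make_valid_url(tmpl, host):
    # re.escape keeps the dots in the host name literal inside the regex.
    return tmpl % re.escape(host)

for host in ('zattoo.com', 'netplus.tv', 'tvplus.m-net.de'):
    pattern = make_valid_url(VALID_URL_TEMPLATE, host)
    m = re.match(pattern, 'https://www.%s/watch/abc/123-abc' % host)
    print(host, '->', m.group('id') if m else 'no match')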

youtube_dl/version.py

@@ -1,3 +1,3 @@
 from __future__ import unicode_literals
 
-__version__ = '2018.09.18'
+__version__ = '2018.10.05'