Merge branch 'master' of github.com:rg3/youtube-dl

This commit is contained in:
Varun Verma 2016-09-16 22:40:31 +05:30
commit 212f9e5926
18 changed files with 246 additions and 103 deletions

View File

@@ -6,8 +6,8 @@
 ---
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.11.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.15*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.11.1**
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.15**
 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.09.11.1
+[debug] youtube-dl version 2016.09.15
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
@@ -55,4 +55,4 @@ $ youtube-dl -v <your command line>
 ### Description of your *issue*, suggested solution and other information
 Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
-If work on your *issue* required an account credentials please provide them or explain how one can obtain them.
+If work on your *issue* requires account credentials please provide them or explain how one can obtain them.

View File

@@ -1,8 +1,26 @@
-version <unreleased>
+version 2016.09.15
+
+Core
+* Improve _hidden_inputs
++ Introduce improved explicit Adobe Pass support
++ Add --ap-mso to provide multiple-system operator identifier
++ Add --ap-username to provide MSO account username
++ Add --ap-password to provide MSO account password
++ Add --ap-list-mso to list all supported MSOs
++ Add support for Rogers Cable multiple-system operator (#10606)
 
 Extractors
-* [kwuo] Improve error detection (#10650)
+* [crunchyroll] Fix authentication (#10655)
+* [twitch] Fix API calls (#10654, #10660)
++ [bellmedia] Add support for more Bell Media Television sites
+* [franceinter] Fix extraction (#10538, #2105)
+* [kuwo] Improve error detection (#10650)
++ [go] Add support for free full episodes (#10439)
 * [bilibili] Fix extraction for specific videos (#10647)
+* [nhk] Fix extraction (#10633)
+* [kaltura] Improve audio detection
+* [kaltura] Skip chun format
++ [vimeo:ondemand] Pass Referer along with embed URL (#10624)
 + [nbc] Add support for NBC Olympics (#10361)

View File

@@ -358,6 +358,17 @@ which means you can modify it, redistribute it or use it however you like.
     -n, --netrc                      Use .netrc authentication data
     --video-password PASSWORD        Video password (vimeo, smotri, youku)
 
+## Adobe Pass Options:
+    --ap-mso MSO                     Adobe Pass multiple-system operator (TV
+                                     provider) identifier, use --ap-list-mso for
+                                     a list of available MSOs
+    --ap-username USERNAME           Multiple-system operator account login
+    --ap-password PASSWORD           Multiple-system operator account password.
+                                     If this option is left out, youtube-dl will
+                                     ask interactively.
+    --ap-list-mso                    List all supported multiple-system
+                                     operators
+
 ## Post-processing Options:
     -x, --extract-audio              Convert video files to audio-only files
                                      (requires ffmpeg or avconv and ffprobe or

View File

@@ -89,6 +89,7 @@
 - **BeatportPro**
 - **Beeg**
 - **BehindKink**
+- **BellMedia**
 - **Bet**
 - **Bigflix**
 - **Bild**: Bild.de
@@ -169,7 +170,6 @@
 - **CSNNE**
 - **CSpan**: C-SPAN
 - **CtsNews**: 華視新聞
-- **CTV**
 - **CTVNews**
 - **culturebox.francetvinfo.fr**
 - **CultureUnplugged**
@@ -445,6 +445,7 @@
 - **NBA**
 - **NBC**
 - **NBCNews**
+- **NBCOlympics**
 - **NBCSports**
 - **NBCSportsVPlayer**
 - **ndr**: NDR.de - Norddeutscher Rundfunk

View File

@@ -131,9 +131,9 @@ class YoutubeDL(object):
     username:          Username for authentication purposes.
     password:          Password for authentication purposes.
     videopassword:     Password for accessing a video.
-    ap_mso:            Adobe Pass Multiple-system operator Identifier.
-    ap_username:       TV Provider username for authentication purposes.
-    ap_password:       TV Provider password for authentication purposes.
+    ap_mso:            Adobe Pass multiple-system operator identifier.
+    ap_username:       Multiple-system operator account username.
+    ap_password:       Multiple-system operator account password.
     usenetrc:          Use netrc for authentication instead.
     verbose:           Print additional info to stdout.
     quiet:             Do not print messages to stdout.

View File

@@ -32,6 +32,7 @@ MSO_INFO = {
 class AdobePassIE(InfoExtractor):
     _SERVICE_PROVIDER_TEMPLATE = 'https://sp.auth.adobe.com/adobe-services/%s'
     _USER_AGENT = 'Mozilla/5.0 (X11; Linux i686; rv:47.0) Gecko/20100101 Firefox/47.0'
+    _MVPD_CACHE = 'ap-mvpd'
 
     @staticmethod
     def _get_mvpd_resource(provider_id, title, guid, rating):
@@ -85,7 +86,7 @@ class AdobePassIE(InfoExtractor):
        guid = xml_text(resource, 'guid')
        count = 0
        while count < 2:
-            requestor_info = self._downloader.cache.load('mvpd', requestor_id) or {}
+            requestor_info = self._downloader.cache.load(self._MVPD_CACHE, requestor_id) or {}
            authn_token = requestor_info.get('authn_token')
            if authn_token and is_expired(authn_token, 'simpleTokenExpires'):
                authn_token = None
@@ -125,12 +126,12 @@ class AdobePassIE(InfoExtractor):
                        'requestor_id': requestor_id,
                    }), headers=mvpd_headers)
                if '<pendingLogout' in session:
-                    self._downloader.cache.store('mvpd', requestor_id, {})
+                    self._downloader.cache.store(self._MVPD_CACHE, requestor_id, {})
                    count += 1
                    continue
                authn_token = unescapeHTML(xml_text(session, 'authnToken'))
                requestor_info['authn_token'] = authn_token
-                self._downloader.cache.store('mvpd', requestor_id, requestor_info)
+                self._downloader.cache.store(self._MVPD_CACHE, requestor_id, requestor_info)
 
            authz_token = requestor_info.get(guid)
            if authz_token and is_expired(authz_token, 'simpleTokenTTL'):
@@ -146,12 +147,12 @@ class AdobePassIE(InfoExtractor):
                        'userMeta': '1',
                    }), headers=mvpd_headers)
                if '<pendingLogout' in authorize:
-                    self._downloader.cache.store('mvpd', requestor_id, {})
+                    self._downloader.cache.store(self._MVPD_CACHE, requestor_id, {})
                    count += 1
                    continue
                authz_token = unescapeHTML(xml_text(authorize, 'authzToken'))
                requestor_info[guid] = authz_token
-                self._downloader.cache.store('mvpd', requestor_id, requestor_info)
+                self._downloader.cache.store(self._MVPD_CACHE, requestor_id, requestor_info)
 
            mvpd_headers.update({
                'ap_19': xml_text(authn_token, 'simpleSamlNameID'),
@@ -167,7 +168,7 @@ class AdobePassIE(InfoExtractor):
                'hashed_guid': 'false',
            }), headers=mvpd_headers)
            if '<pendingLogout' in short_authorize:
-                self._downloader.cache.store('mvpd', requestor_id, {})
+                self._downloader.cache.store(self._MVPD_CACHE, requestor_id, {})
                count += 1
                continue
            return short_authorize

View File

@@ -71,7 +71,7 @@ class CanvasIE(InfoExtractor):
             webpage)).strip()
 
         video_id = self._html_search_regex(
-            r'data-video=(["\'])(?P<id>.+?)\1', webpage, 'video id', group='id')
+            r'data-video=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video id', group='id')
 
         data = self._download_json(
             'https://mediazone.vrt.be/api/v1/%s/assets/%s'
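This hunk (and the matching NFL and SchoolTV hunks below) swaps the lazy `.+?` for the tempered token `(?:(?!\1).)+`, which can never consume the closing quote. The difference shows up when backtracking would otherwise push a lazy match past the real delimiter; a toy demonstration with simplified patterns and a hypothetical HTML snippet:

```python
import re

html = '<div data-video="abc" class="player">'

# Lazy version: backtracking lets the match run past the closing quote
# until it finds a quote that happens to be followed by '>'.
lazy = re.search(r'data-video=(["\'])(?P<id>.+?)\1>', html)

# Tempered version: every character must NOT be the opening quote,
# so the match can never cross the closing delimiter; here it fails
# cleanly instead of silently returning garbage.
tempered = re.search(r'data-video=(["\'])(?P<id>(?:(?!\1).)+)\1>', html)

print(lazy.group('id'))  # abc" class="player  (overran the closing quote)
print(tempered)          # None
```

In the extractor the tempered pattern yields the intended id where it exists and refuses to produce an overrun match where it does not.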

View File

@@ -674,23 +674,26 @@ class InfoExtractor(object):
                 username = info[0]
                 password = info[2]
             else:
-                raise netrc.NetrcParseError('No authenticators for %s' % netrc_machine)
+                raise netrc.NetrcParseError(
+                    'No authenticators for %s' % netrc_machine)
         except (IOError, netrc.NetrcParseError) as err:
-            self._downloader.report_warning('parsing .netrc: %s' % error_to_compat_str(err))
+            self._downloader.report_warning(
+                'parsing .netrc: %s' % error_to_compat_str(err))
 
-        return (username, password)
+        return username, password
 
     def _get_login_info(self, username_option='username', password_option='password', netrc_machine=None):
         """
         Get the login info as (username, password)
-        It will look in the netrc file using the _NETRC_MACHINE value
+        First look for the manually specified credentials using username_option
+        and password_option as keys in params dictionary. If no such credentials
+        available look in the netrc file using the netrc_machine or _NETRC_MACHINE
+        value.
         If there's no info available, return (None, None)
         """
         if self._downloader is None:
             return (None, None)
 
-        username = None
-        password = None
         downloader_params = self._downloader.params
 
         # Attempt to use provided username and password or .netrc data
@@ -700,7 +703,7 @@ class InfoExtractor(object):
         else:
             username, password = self._get_netrc_login_info(netrc_machine)
 
-        return (username, password)
+        return username, password
 
     def _get_tfa_info(self, note='two-factor verification code'):
         """
@@ -888,16 +891,16 @@ class InfoExtractor(object):
     def _hidden_inputs(html):
         html = re.sub(r'<!--(?:(?!<!--).)*-->', '', html)
         hidden_inputs = {}
-        for input in re.findall(r'(?i)<input([^>]+)>', html):
-            if not re.search(r'type=(["\'])(?:hidden|submit)\1', input):
+        for input in re.findall(r'(?i)(<input[^>]+>)', html):
+            attrs = extract_attributes(input)
+            if not input:
                 continue
-            name = re.search(r'(?:name|id)=(["\'])(?P<value>.+?)\1', input)
-            if not name:
+            if attrs.get('type') not in ('hidden', 'submit'):
                 continue
-            value = re.search(r'value=(["\'])(?P<value>.*?)\1', input)
-            if not value:
-                continue
-            hidden_inputs[name.group('value')] = value.group('value')
+            name = attrs.get('name') or attrs.get('id')
+            value = attrs.get('value')
+            if name and value is not None:
+                hidden_inputs[name] = value
         return hidden_inputs
 
     def _form_hidden_inputs(self, form_id, html):
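The rewritten `_hidden_inputs` parses each `<input>` tag into an attribute dict via `extract_attributes` instead of probing it with three separate regexes, so attribute order no longer matters and empty `value=""` inputs survive (the `value is not None` check keeps empty strings). A rough stand-in using only the standard library, assuming well-formed tags (`extract_attributes` here is a simplified hypothetical, not youtube-dl's implementation):

```python
import re
from html.parser import HTMLParser

def extract_attributes(tag):
    """Parse a single HTML tag into an attribute dict (stdlib stand-in)."""
    attrs = {}

    class _TagParser(HTMLParser):
        def handle_starttag(self, name, attr_pairs):
            attrs.update(attr_pairs)

    _TagParser().feed(tag)
    return attrs

def hidden_inputs(html):
    # Strip comments, then collect hidden/submit inputs by name (or id)
    html = re.sub(r'<!--(?:(?!<!--).)*-->', '', html)
    result = {}
    for tag in re.findall(r'(?i)(<input[^>]+>)', html):
        attrs = extract_attributes(tag)
        if attrs.get('type') not in ('hidden', 'submit'):
            continue
        name = attrs.get('name') or attrs.get('id')
        value = attrs.get('value')
        if name and value is not None:  # keep empty strings, drop missing values
            result[name] = value
    return result

print(hidden_inputs('<form><input type="hidden" name="csrf" value=""></form>'))
# {'csrf': ''}
```

The old regex-per-attribute version would have dropped the empty `csrf` value above, because `value=(["\']).*?\1` matched an empty string that then failed the truthiness check.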

View File

@@ -34,22 +34,51 @@ from ..aes import (
 
 class CrunchyrollBaseIE(InfoExtractor):
+    _LOGIN_URL = 'https://www.crunchyroll.com/login'
+    _LOGIN_FORM = 'login_form'
     _NETRC_MACHINE = 'crunchyroll'
 
     def _login(self):
         (username, password) = self._get_login_info()
         if username is None:
             return
-        self.report_login()
-        login_url = 'https://www.crunchyroll.com/?a=formhandler'
-        data = urlencode_postdata({
-            'formname': 'RpcApiUser_Login',
-            'name': username,
-            'password': password,
+
+        login_page = self._download_webpage(
+            self._LOGIN_URL, None, 'Downloading login page')
+
+        login_form_str = self._search_regex(
+            r'(?P<form><form[^>]+?id=(["\'])%s\2[^>]*>)' % self._LOGIN_FORM,
+            login_page, 'login form', group='form')
+
+        post_url = extract_attributes(login_form_str).get('action')
+        if not post_url:
+            post_url = self._LOGIN_URL
+        elif not post_url.startswith('http'):
+            post_url = compat_urlparse.urljoin(self._LOGIN_URL, post_url)
+
+        login_form = self._form_hidden_inputs(self._LOGIN_FORM, login_page)
+
+        login_form.update({
+            'login_form[name]': username,
+            'login_form[password]': password,
         })
-        login_request = sanitized_Request(login_url, data)
-        login_request.add_header('Content-Type', 'application/x-www-form-urlencoded')
-        self._download_webpage(login_request, None, False, 'Wrong login info')
+
+        response = self._download_webpage(
+            post_url, None, 'Logging in', 'Wrong login info',
+            data=urlencode_postdata(login_form),
+            headers={'Content-Type': 'application/x-www-form-urlencoded'})
+
+        # Successful login
+        if '<title>Redirecting' in response:
+            return
+
+        error = self._html_search_regex(
+            '(?s)<ul[^>]+class=["\']messages["\'][^>]*>(.+?)</ul>',
+            response, 'error message', default=None)
+        if error:
+            raise ExtractorError('Unable to login: %s' % error, expected=True)
+
+        raise ExtractorError('Unable to log in')
 
     def _real_initialize(self):
         self._login()

View File

@@ -10,14 +10,14 @@ class FranceInterIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?franceinter\.fr/emissions/(?P<id>[^?#]+)'
 
     _TEST = {
-        'url': 'https://www.franceinter.fr/emissions/la-marche-de-l-histoire/la-marche-de-l-histoire-18-decembre-2013',
-        'md5': '4764932e466e6f6c79c317d2e74f6884',
+        'url': 'https://www.franceinter.fr/emissions/la-tete-au-carre/la-tete-au-carre-14-septembre-2016',
+        'md5': '4e3aeb58fe0e83d7b0581fa213c409d0',
         'info_dict': {
-            'id': 'la-marche-de-l-histoire/la-marche-de-l-histoire-18-decembre-2013',
+            'id': 'la-tete-au-carre/la-tete-au-carre-14-septembre-2016',
             'ext': 'mp3',
-            'title': 'L’Histoire dans les jeux vidéo du 18 décembre 2013 - France Inter',
-            'description': 'md5:7f2ce449894d1e585932273080fb410d',
-            'upload_date': '20131218',
+            'title': 'Et si les rêves pouvaient nous aider à agir dans notre vie quotidienne ?',
+            'description': 'md5:a245dd62cf5bf51de915f8d9956d180a',
+            'upload_date': '20160914',
         },
     }
@@ -39,7 +39,7 @@ class FranceInterIE(InfoExtractor):
         if upload_date_str:
             upload_date_list = upload_date_str.split()
             upload_date_list.reverse()
-            upload_date_list[1] = compat_str(month_by_name(upload_date_list[1], lang='fr'))
+            upload_date_list[1] = '%02d' % (month_by_name(upload_date_list[1], lang='fr') or 0)
             upload_date = ''.join(upload_date_list)
         else:
             upload_date = None

View File

@@ -165,7 +165,7 @@ class NFLIE(InfoExtractor):
                 group='config'))
         # For articles, the id in the url is not the video id
         video_id = self._search_regex(
-            r'(?:<nflcs:avplayer[^>]+data-content[Ii]d\s*=\s*|content[Ii]d\s*:\s*)(["\'])(?P<id>.+?)\1',
+            r'(?:<nflcs:avplayer[^>]+data-content[Ii]d\s*=\s*|content[Ii]d\s*:\s*)(["\'])(?P<id>(?:(?!\1).)+)\1',
             webpage, 'video id', default=video_id, group='id')
         config = self._download_json(config_url, video_id, 'Downloading player config')
         url_template = NFLIE.prepend_host(

View File

@@ -429,7 +429,7 @@ class SchoolTVIE(InfoExtractor):
         display_id = self._match_id(url)
         webpage = self._download_webpage(url, display_id)
         video_id = self._search_regex(
-            r'data-mid=(["\'])(?P<id>.+?)\1', webpage, 'video_id', group='id')
+            r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video_id', group='id')
         return {
             '_type': 'url_transparent',
             'ie_key': 'NPO',

View File

@@ -13,6 +13,7 @@ from ..utils import (
     xpath_element,
     ExtractorError,
     determine_protocol,
+    unsmuggle_url,
 )
 
@@ -35,28 +36,51 @@ class RadioCanadaIE(InfoExtractor):
     }
 
     def _real_extract(self, url):
+        url, smuggled_data = unsmuggle_url(url, {})
         app_code, video_id = re.match(self._VALID_URL, url).groups()
 
-        device_types = ['ipad', 'android']
+        metadata = self._download_xml(
+            'http://api.radio-canada.ca/metaMedia/v1/index.ashx',
+            video_id, note='Downloading metadata XML', query={
+                'appCode': app_code,
+                'idMedia': video_id,
+            })
+
+        def get_meta(name):
+            el = find_xpath_attr(metadata, './/Meta', 'name', name)
+            return el.text if el is not None else None
+
+        if get_meta('protectionType'):
+            raise ExtractorError('This video is DRM protected.', expected=True)
+
+        device_types = ['ipad']
         if app_code != 'toutv':
             device_types.append('flash')
+        if not smuggled_data:
+            device_types.append('android')
 
         formats = []
         # TODO: extract f4m formats
         # f4m formats can be extracted using flashhd device_type but they produce unplayable file
         for device_type in device_types:
-            v_data = self._download_xml(
-                'http://api.radio-canada.ca/validationMedia/v1/Validation.ashx',
-                video_id, note='Downloading %s XML' % device_type, query={
+            validation_url = 'http://api.radio-canada.ca/validationMedia/v1/Validation.ashx'
+            query = {
                 'appCode': app_code,
                 'idMedia': video_id,
                 'connectionType': 'broadband',
                 'multibitrate': 'true',
                 'deviceType': device_type,
-                # paysJ391wsHjbOJwvCs26toz and bypasslock are used to bypass geo-restriction
-                'paysJ391wsHjbOJwvCs26toz': 'CA',
-                'bypasslock': 'NZt5K62gRqfc',
-            }, fatal=False)
+            }
+            if smuggled_data:
+                validation_url = 'https://services.radio-canada.ca/media/validation/v2/'
+                query.update(smuggled_data)
+            else:
+                query.update({
+                    # paysJ391wsHjbOJwvCs26toz and bypasslock are used to bypass geo-restriction
+                    'paysJ391wsHjbOJwvCs26toz': 'CA',
+                    'bypasslock': 'NZt5K62gRqfc',
+                })
+            v_data = self._download_xml(validation_url, video_id, note='Downloading %s XML' % device_type, query=query, fatal=False)
             v_url = xpath_text(v_data, 'url')
             if not v_url:
                 continue
@@ -101,17 +125,6 @@ class RadioCanadaIE(InfoExtractor):
                     f4m_id='hds', fatal=False))
         self._sort_formats(formats)
 
-        metadata = self._download_xml(
-            'http://api.radio-canada.ca/metaMedia/v1/index.ashx',
-            video_id, note='Downloading metadata XML', query={
-                'appCode': app_code,
-                'idMedia': video_id,
-            })
-
-        def get_meta(name):
-            el = find_xpath_attr(metadata, './/Meta', 'name', name)
-            return el.text if el is not None else None
-
         return {
             'id': video_id,
             'title': get_meta('Title'),

View File

@@ -2,12 +2,22 @@
 from __future__ import unicode_literals
 
 from .common import InfoExtractor
-from ..utils import int_or_none
+from ..utils import (
+    int_or_none,
+    js_to_json,
+    ExtractorError,
+    urlencode_postdata,
+    extract_attributes,
+    smuggle_url,
+)
 
 
 class TouTvIE(InfoExtractor):
+    _NETRC_MACHINE = 'toutv'
     IE_NAME = 'tou.tv'
     _VALID_URL = r'https?://ici\.tou\.tv/(?P<id>[a-zA-Z0-9_-]+/S[0-9]+E[0-9]+)'
+    _access_token = None
+    _claims = None
 
     _TEST = {
         'url': 'http://ici.tou.tv/garfield-tout-court/S2015E17',
@@ -22,18 +32,64 @@ class TouTvIE(InfoExtractor):
             # m3u8 download
             'skip_download': True,
         },
+        'skip': '404 Not Found',
     }
 
+    def _real_initialize(self):
+        email, password = self._get_login_info()
+        if email is None:
+            return
+        state = 'http://ici.tou.tv//'
+        webpage = self._download_webpage(state, None, 'Downloading homepage')
+        toutvlogin = self._parse_json(self._search_regex(
+            r'(?s)toutvlogin\s*=\s*({.+?});', webpage, 'toutvlogin'), None, js_to_json)
+        authorize_url = toutvlogin['host'] + '/auth/oauth/v2/authorize'
+        login_webpage = self._download_webpage(
+            authorize_url, None, 'Downloading login page', query={
+                'client_id': toutvlogin['clientId'],
+                'redirect_uri': 'https://ici.tou.tv/login/loginCallback',
+                'response_type': 'token',
+                'scope': 'media-drmt openid profile email id.write media-validation.read.privileged',
+                'state': state,
+            })
+        login_form = self._search_regex(
+            r'(?s)(<form[^>]+id="Form-login".+?</form>)', login_webpage, 'login form')
+        form_data = self._hidden_inputs(login_form)
+        form_data.update({
+            'login-email': email,
+            'login-password': password,
+        })
+        post_url = extract_attributes(login_form).get('action') or authorize_url
+        _, urlh = self._download_webpage_handle(
+            post_url, None, 'Logging in', data=urlencode_postdata(form_data))
+        self._access_token = self._search_regex(
+            r'access_token=([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})',
+            urlh.geturl(), 'access token')
+        self._claims = self._download_json(
+            'https://services.radio-canada.ca/media/validation/v2/getClaims',
+            None, 'Extracting Claims', query={
+                'token': self._access_token,
+                'access_token': self._access_token,
+            })['claims']
+
     def _real_extract(self, url):
         path = self._match_id(url)
         metadata = self._download_json('http://ici.tou.tv/presentation/%s' % path, path)
+        if metadata.get('IsDrm'):
+            raise ExtractorError('This video is DRM protected.', expected=True)
         video_id = metadata['IdMedia']
         details = metadata['Details']
         title = details['OriginalTitle']
+        video_url = 'radiocanada:%s:%s' % (metadata.get('AppCode', 'toutv'), video_id)
+        if self._access_token and self._claims:
+            video_url = smuggle_url(video_url, {
+                'access_token': self._access_token,
+                'claims': self._claims,
+            })
 
         return {
             '_type': 'url_transparent',
-            'url': 'radiocanada:%s:%s' % (metadata.get('AppCode', 'toutv'), video_id),
+            'url': video_url,
             'id': video_id,
             'title': title,
             'thumbnail': details.get('ImageUrl'),
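TouTV now hands its OAuth token and claims to the RadioCanada extractor by smuggling them into the `radiocanada:` URL, which RadioCanada's `_real_extract` unpacks via `unsmuggle_url`. A rough sketch of the smuggle/unsmuggle mechanism (same idea as youtube-dl's `utils.smuggle_url`, which packs JSON into the URL fragment; the exact encoding here is simplified):

```python
import json
from urllib.parse import quote, unquote

def smuggle_url(url, data):
    # Pack extra data into the URL fragment so it survives being passed
    # between extractors as a plain string
    return url + '#__youtubedl_smuggle=' + quote(json.dumps(data))

def unsmuggle_url(smug_url, default=None):
    if '#__youtubedl_smuggle' not in smug_url:
        return smug_url, default
    url, _, sdata = smug_url.partition('#__youtubedl_smuggle=')
    return url, json.loads(unquote(sdata))

u = smuggle_url('radiocanada:toutv:123', {'access_token': 'tok'})
print(unsmuggle_url(u))  # ('radiocanada:toutv:123', {'access_token': 'tok'})
```

This is why RadioCanada's hunk above branches on `smuggled_data`: a smuggled (authenticated) request goes to the v2 validation endpoint with the token in the query, while an anonymous one falls back to the geo-bypass parameters.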

View File

@@ -2,9 +2,13 @@
 from __future__ import unicode_literals
 
 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
     ExtractorError,
+    int_or_none,
     parse_iso8601,
+    try_get,
+    update_url_query,
 )
 
@@ -65,36 +69,47 @@ class TV4IE(InfoExtractor):
         video_id = self._match_id(url)
 
         info = self._download_json(
-            'http://www.tv4play.se/player/assets/%s.json' % video_id, video_id, 'Downloading video info JSON')
+            'http://www.tv4play.se/player/assets/%s.json' % video_id,
+            video_id, 'Downloading video info JSON')
 
         # If is_geo_restricted is true, it doesn't necessarily mean we can't download it
-        if info['is_geo_restricted']:
+        if info.get('is_geo_restricted'):
             self.report_warning('This content might not be available in your country due to licensing restrictions.')
-        if info['requires_subscription']:
+        if info.get('requires_subscription'):
             raise ExtractorError('This content requires subscription.', expected=True)
 
-        sources_data = self._download_json(
-            'https://prima.tv4play.se/api/web/asset/%s/play.json?protocol=http&videoFormat=MP4' % video_id, video_id, 'Downloading sources JSON')
-        sources = sources_data['playback']
+        title = info['title']
 
         formats = []
-        for item in sources.get('items', {}).get('item', []):
-            ext, bitrate = item['mediaFormat'], item['bitrate']
-            formats.append({
-                'format_id': '%s_%s' % (ext, bitrate),
-                'tbr': bitrate,
-                'ext': ext,
-                'url': item['url'],
-            })
+        # http formats are linked with unresolvable host
+        for kind in ('hls', ''):
+            data = self._download_json(
+                'https://prima.tv4play.se/api/web/asset/%s/play.json' % video_id,
+                video_id, 'Downloading sources JSON', query={
+                    'protocol': kind,
+                    'videoFormat': 'MP4+WEBVTTS+WEBVTT',
+                })
+            item = try_get(data, lambda x: x['playback']['items']['item'], dict)
+            manifest_url = item.get('url')
+            if not isinstance(manifest_url, compat_str):
+                continue
+            if kind == 'hls':
+                formats.extend(self._extract_m3u8_formats(
+                    manifest_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                    m3u8_id=kind, fatal=False))
+            else:
+                formats.extend(self._extract_f4m_formats(
+                    update_url_query(manifest_url, {'hdcore': '3.8.0'}),
+                    video_id, f4m_id='hds', fatal=False))
         self._sort_formats(formats)
 
         return {
             'id': video_id,
-            'title': info['title'],
+            'title': title,
             'formats': formats,
             'description': info.get('description'),
             'timestamp': parse_iso8601(info.get('broadcast_date_time')),
-            'duration': info.get('duration'),
+            'duration': int_or_none(info.get('duration')),
             'thumbnail': info.get('image'),
-            'is_live': sources.get('live'),
+            'is_live': info.get('is_live') is True,
         }
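The TV4 rewrite walks into the nested playback JSON with `try_get` instead of chained indexing, so a missing level yields None rather than a KeyError, and a type check rejects unexpected shapes. A stand-in matching the `try_get(src, getter, expected_type)` signature used in the diff:

```python
def try_get(src, getter, expected_type=None):
    # Swallow lookup errors from the getter; optionally enforce the
    # result type, returning None on any mismatch
    try:
        v = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(v, expected_type):
        return v
    return None

data = {'playback': {'items': {'item': {'url': 'http://example.com/a.m3u8'}}}}
item = try_get(data, lambda x: x['playback']['items']['item'], dict)
print(item['url'])                                             # http://example.com/a.m3u8
print(try_get({}, lambda x: x['playback']['items']['item'], dict))  # None
```

The benefit in the extractor loop: an empty or reshaped API response for one `kind` simply causes a `continue` instead of aborting the whole extraction.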

View File

@@ -32,6 +32,7 @@ class TwitchBaseIE(InfoExtractor):
     _API_BASE = 'https://api.twitch.tv'
     _USHER_BASE = 'https://usher.ttvnw.net'
     _LOGIN_URL = 'http://www.twitch.tv/login'
+    _CLIENT_ID = 'jzkbprff40iqj646a697cyrvl0zt2m6'
     _NETRC_MACHINE = 'twitch'
 
     def _handle_error(self, response):
@@ -44,15 +45,9 @@ class TwitchBaseIE(InfoExtractor):
             expected=True)
 
     def _call_api(self, path, item_id, note):
-        headers = {
-            'Referer': 'http://api.twitch.tv/crossdomain/receiver.html?v=2',
-            'X-Requested-With': 'XMLHttpRequest',
-        }
-        for cookie in self._downloader.cookiejar:
-            if cookie.name == 'api_token':
-                headers['Twitch-Api-Token'] = cookie.value
         response = self._download_json(
-            '%s/%s' % (self._API_BASE, path), item_id, note)
+            '%s/%s' % (self._API_BASE, path), item_id, note,
+            headers={'Client-ID': self._CLIENT_ID})
         self._handle_error(response)
         return response

View File

@@ -355,19 +355,19 @@ def parseOpts(overrideArguments=None):
     adobe_pass.add_option(
         '--ap-mso',
         dest='ap_mso', metavar='MSO',
-        help='Adobe Pass Multiple-system operator Identifier')
+        help='Adobe Pass multiple-system operator (TV provider) identifier, use --ap-list-mso for a list of available MSOs')
     adobe_pass.add_option(
         '--ap-username',
         dest='ap_username', metavar='USERNAME',
-        help='TV Provider Login with this account ID')
+        help='Multiple-system operator account login')
     adobe_pass.add_option(
         '--ap-password',
         dest='ap_password', metavar='PASSWORD',
-        help='TV Provider Account password. If this option is left out, youtube-dl will ask interactively.')
+        help='Multiple-system operator account password. If this option is left out, youtube-dl will ask interactively.')
     adobe_pass.add_option(
         '--ap-list-mso',
         action='store_true', dest='ap_list_mso', default=False,
-        help='List all supported TV Providers')
+        help='List all supported multiple-system operators')
 
     video_format = optparse.OptionGroup(parser, 'Video Format Options')
     video_format.add_option(
@@ -831,6 +831,7 @@ def parseOpts(overrideArguments=None):
     parser.add_option_group(video_format)
     parser.add_option_group(subtitles)
     parser.add_option_group(authentication)
+    parser.add_option_group(adobe_pass)
     parser.add_option_group(postproc)
 
     if overrideArguments is not None:

View File

@@ -1,3 +1,3 @@
 from __future__ import unicode_literals
 
-__version__ = '2016.09.11.1'
+__version__ = '2016.09.15'