Merge remote-tracking branch 'refs/remotes/rg3/master' into tunepk

Irfan Charania 2017-02-26 10:23:17 -08:00
commit caf1bda6eb
52 changed files with 979 additions and 260 deletions


@@ -6,8 +6,8 @@
--- ---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.02.17*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected. ### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.02.24.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.02.17** - [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.02.24.1**
### Before submitting an *issue* make sure you have: ### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: [] [debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2017.02.17 [debug] youtube-dl version 2017.02.24.1
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {} [debug] Proxy map: {}

102
ChangeLog

@@ -1,3 +1,105 @@
version <unreleased>
Extractors
+ [MDR] Recognize more URL patterns (#12169)
* [vevo] Fix extraction for videos with the new streams/streamsV3 format
(#11719)
+ [njpwworld] Add new extractor (#11561)
version 2017.02.24.1
Extractors
* [noco] Modernize
* [noco] Switch login URL to https (#12246)
+ [thescene] Extract more metadata
* [thescene] Fix extraction (#12235)
+ [tubitv] Use geo bypass mechanism
* [openload] Fix extraction (#10408)
+ [ivi] Raise GeoRestrictedError
version 2017.02.24
Core
* [options] Hide deprecated options from --help
* [options] Deprecate --autonumber-size
+ [YoutubeDL] Add support for string formatting operations in output template
(#5185, #5748, #6841, #9929, #9966 #9978, #12189)
Extractors
+ [lynda:course] Add webpage extraction fallback (#12238)
* [go] Sign all uplynk URLs and use geo bypass only for free videos
(#12087, #12210)
+ [skylinewebcams] Add support for skylinewebcams.com (#12221)
+ [instagram] Add support for multi video posts (#12226)
+ [crunchyroll] Extract playlist entries ids
* [mgtv] Fix extraction
+ [sohu] Raise GeoRestrictedError
+ [leeco] Raise GeoRestrictedError and use geo bypass mechanism
version 2017.02.22
Extractors
* [crunchyroll] Fix descriptions with double quotes (#12124)
* [dailymotion] Make comment count optional (#12209)
+ [vidzi] Add support for vidzi.cc (#12213)
+ [24video] Add support for 24video.tube (#12217)
+ [crackle] Use geo bypass mechanism
+ [viewster] Use geo verification headers
+ [tfo] Improve geo restriction detection and use geo bypass mechanism
+ [telequebec] Use geo bypass mechanism
+ [limelight] Extract PlaylistService errors and improve geo restriction
detection
version 2017.02.21
Core
* [extractor/common] Allow calling _initialize_geo_bypass from extractors
(#11970)
+ [adobepass] Add support for Time Warner Cable (#12191)
+ [travis] Run tests in parallel
+ [downloader/ism] Honor HTTP headers when downloading fragments
+ [downloader/dash] Honor HTTP headers when downloading fragments
+ [utils] Add GeoUtils class for working with geo tools and GeoUtils.random_ipv4
+ Add option --geo-bypass-country for explicit geo bypass on behalf of
specified country
+ Add options to control geo bypass mechanism --geo-bypass and --no-geo-bypass
+ Add experimental geo restriction bypass mechanism based on faking
X-Forwarded-For HTTP header
+ [utils] Introduce GeoRestrictedError for geo restricted videos
+ [utils] Introduce YoutubeDLError base class for all youtube-dl exceptions
Extractors
+ [ninecninemedia] Use geo bypass mechanism
* [spankbang] Make uploader optional (#12193)
+ [iprima] Improve geo restriction detection and disable geo bypass
* [iprima] Modernize
* [commonmistakes] Disable UnicodeBOM extractor test for python 3.2
+ [prosiebensat1] Throw ExtractionError on unsupported page type (#12180)
* [nrk] Update _API_HOST and relax _VALID_URL
+ [tv4] Bypass geo restriction and improve detection
* [tv4] Switch to hls3 protocol (#12177)
+ [viki] Improve geo restriction detection
+ [vgtv] Improve geo restriction detection
+ [srgssr] Improve geo restriction detection
+ [vbox7] Improve geo restriction detection and use geo bypass mechanism
+ [svt] Improve geo restriction detection and use geo bypass mechanism
+ [pbs] Improve geo restriction detection and use geo bypass mechanism
+ [ondemandkorea] Improve geo restriction detection and use geo bypass mechanism
+ [nrk] Improve geo restriction detection and use geo bypass mechanism
+ [itv] Improve geo restriction detection and use geo bypass mechanism
+ [go] Improve geo restriction detection and use geo bypass mechanism
+ [dramafever] Improve geo restriction detection and use geo bypass mechanism
* [brightcove:legacy] Restrict videoPlayer value (#12040)
+ [tvn24] Add support for tvn24.pl and tvn24bis.pl (#11679)
+ [thisav] Add support for HTML5 media (#11771)
* [metacafe] Bypass family filter (#10371)
* [viceland] Improve info extraction
version 2017.02.17 version 2017.02.17
Extractors Extractors

182
README.md

@@ -99,11 +99,21 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
--source-address IP Client-side IP address to bind to --source-address IP Client-side IP address to bind to
-4, --force-ipv4 Make all connections via IPv4 -4, --force-ipv4 Make all connections via IPv4
-6, --force-ipv6 Make all connections via IPv6 -6, --force-ipv6 Make all connections via IPv6
## Geo Restriction:
--geo-verification-proxy URL Use this proxy to verify the IP address for --geo-verification-proxy URL Use this proxy to verify the IP address for
some geo-restricted sites. The default some geo-restricted sites. The default
proxy specified by --proxy (or none, if the proxy specified by --proxy (or none, if the
options is not present) is used for the options is not present) is used for the
actual downloading. actual downloading.
--geo-bypass Bypass geographic restriction via faking
X-Forwarded-For HTTP header (experimental)
--no-geo-bypass Do not bypass geographic restriction via
faking X-Forwarded-For HTTP header
(experimental)
--geo-bypass-country CODE Force bypass geographic restriction with
explicitly provided two-letter ISO 3166-2
country code (experimental)
## Video Selection: ## Video Selection:
--playlist-start NUMBER Playlist video to start at (default is 1) --playlist-start NUMBER Playlist video to start at (default is 1)
@@ -140,17 +150,19 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
check if the key is not present, key > check if the key is not present, key >
NUMBER (like "comment_count > 12", also NUMBER (like "comment_count > 12", also
works with >=, <, <=, !=, =) to compare works with >=, <, <=, !=, =) to compare
against a number, and & to require multiple against a number, key = 'LITERAL' (like
matches. Values which are not known are "uploader = 'Mike Smith'", also works with
excluded unless you put a question mark (?) !=) to match against a string literal and &
after the operator. For example, to only to require multiple matches. Values which
match videos that have been liked more than are not known are excluded unless you put a
100 times and disliked less than 50 times question mark (?) after the operator. For
(or the dislike functionality is not example, to only match videos that have
available at the given service), but who been liked more than 100 times and disliked
also have a description, use --match-filter less than 50 times (or the dislike
"like_count > 100 & dislike_count <? 50 & functionality is not available at the given
description" . service), but who also have a description,
use --match-filter "like_count > 100 &
dislike_count <? 50 & description" .
--no-playlist Download only the video, if the URL refers --no-playlist Download only the video, if the URL refers
to a video and a playlist. to a video and a playlist.
--yes-playlist Download the playlist, if the URL refers to --yes-playlist Download the playlist, if the URL refers to
@@ -205,21 +217,11 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
--id Use only video ID in file name --id Use only video ID in file name
-o, --output TEMPLATE Output filename template, see the "OUTPUT -o, --output TEMPLATE Output filename template, see the "OUTPUT
TEMPLATE" for all the info TEMPLATE" for all the info
--autonumber-size NUMBER Specify the number of digits in
%(autonumber)s when it is present in output
filename template or --auto-number option
is given (default is 5)
--autonumber-start NUMBER Specify the start value for %(autonumber)s --autonumber-start NUMBER Specify the start value for %(autonumber)s
(default is 1) (default is 1)
--restrict-filenames Restrict filenames to only ASCII --restrict-filenames Restrict filenames to only ASCII
characters, and avoid "&" and spaces in characters, and avoid "&" and spaces in
filenames filenames
-A, --auto-number [deprecated; use -o
"%(autonumber)s-%(title)s.%(ext)s" ] Number
downloaded files starting from 00000
-t, --title [deprecated] Use title in file name
(default)
-l, --literal [deprecated] Alias of --title
-w, --no-overwrites Do not overwrite files -w, --no-overwrites Do not overwrite files
-c, --continue Force resume of partially downloaded files. -c, --continue Force resume of partially downloaded files.
By default, youtube-dl will resume By default, youtube-dl will resume
@@ -474,87 +476,89 @@ The `-o` option allows users to indicate a template for the output file names.
**tl;dr:** [navigate me to examples](#output-template-examples). **tl;dr:** [navigate me to examples](#output-template-examples).
The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "http://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences have the format `%(NAME)s`. To clarify, that is a percent symbol followed by a name in parentheses, followed by a lowercase S. Allowed names are: The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "http://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by a formatting operations. Allowed names along with sequence type are:
- `id`: Video identifier - `id` (string): Video identifier
- `title`: Video title - `title` (string): Video title
- `url`: Video URL - `url` (string): Video URL
- `ext`: Video filename extension - `ext` (string): Video filename extension
- `alt_title`: A secondary title of the video - `alt_title` (string): A secondary title of the video
- `display_id`: An alternative identifier for the video - `display_id` (string): An alternative identifier for the video
- `uploader`: Full name of the video uploader - `uploader` (string): Full name of the video uploader
- `license`: License name the video is licensed under - `license` (string): License name the video is licensed under
- `creator`: The creator of the video - `creator` (string): The creator of the video
- `release_date`: The date (YYYYMMDD) when the video was released - `release_date` (string): The date (YYYYMMDD) when the video was released
- `timestamp`: UNIX timestamp of the moment the video became available - `timestamp` (numeric): UNIX timestamp of the moment the video became available
- `upload_date`: Video upload date (YYYYMMDD) - `upload_date` (string): Video upload date (YYYYMMDD)
- `uploader_id`: Nickname or id of the video uploader - `uploader_id` (string): Nickname or id of the video uploader
- `location`: Physical location where the video was filmed - `location` (string): Physical location where the video was filmed
- `duration`: Length of the video in seconds - `duration` (numeric): Length of the video in seconds
- `view_count`: How many users have watched the video on the platform - `view_count` (numeric): How many users have watched the video on the platform
- `like_count`: Number of positive ratings of the video - `like_count` (numeric): Number of positive ratings of the video
- `dislike_count`: Number of negative ratings of the video - `dislike_count` (numeric): Number of negative ratings of the video
- `repost_count`: Number of reposts of the video - `repost_count` (numeric): Number of reposts of the video
- `average_rating`: Average rating give by users, the scale used depends on the webpage - `average_rating` (numeric): Average rating give by users, the scale used depends on the webpage
- `comment_count`: Number of comments on the video - `comment_count` (numeric): Number of comments on the video
- `age_limit`: Age restriction for the video (years) - `age_limit` (numeric): Age restriction for the video (years)
- `format`: A human-readable description of the format - `format` (string): A human-readable description of the format
- `format_id`: Format code specified by `--format` - `format_id` (string): Format code specified by `--format`
- `format_note`: Additional info about the format - `format_note` (string): Additional info about the format
- `width`: Width of the video - `width` (numeric): Width of the video
- `height`: Height of the video - `height` (numeric): Height of the video
- `resolution`: Textual description of width and height - `resolution` (string): Textual description of width and height
- `tbr`: Average bitrate of audio and video in KBit/s - `tbr` (numeric): Average bitrate of audio and video in KBit/s
- `abr`: Average audio bitrate in KBit/s - `abr` (numeric): Average audio bitrate in KBit/s
- `acodec`: Name of the audio codec in use - `acodec` (string): Name of the audio codec in use
- `asr`: Audio sampling rate in Hertz - `asr` (numeric): Audio sampling rate in Hertz
- `vbr`: Average video bitrate in KBit/s - `vbr` (numeric): Average video bitrate in KBit/s
- `fps`: Frame rate - `fps` (numeric): Frame rate
- `vcodec`: Name of the video codec in use - `vcodec` (string): Name of the video codec in use
- `container`: Name of the container format - `container` (string): Name of the container format
- `filesize`: The number of bytes, if known in advance - `filesize` (numeric): The number of bytes, if known in advance
- `filesize_approx`: An estimate for the number of bytes - `filesize_approx` (numeric): An estimate for the number of bytes
- `protocol`: The protocol that will be used for the actual download - `protocol` (string): The protocol that will be used for the actual download
- `extractor`: Name of the extractor - `extractor` (string): Name of the extractor
- `extractor_key`: Key name of the extractor - `extractor_key` (string): Key name of the extractor
- `epoch`: Unix epoch when creating the file - `epoch` (numeric): Unix epoch when creating the file
- `autonumber`: Five-digit number that will be increased with each download, starting at zero - `autonumber` (numeric): Five-digit number that will be increased with each download, starting at zero
- `playlist`: Name or id of the playlist that contains the video - `playlist` (string): Name or id of the playlist that contains the video
- `playlist_index`: Index of the video in the playlist padded with leading zeros according to the total length of the playlist - `playlist_index` (numeric): Index of the video in the playlist padded with leading zeros according to the total length of the playlist
- `playlist_id`: Playlist identifier - `playlist_id` (string): Playlist identifier
- `playlist_title`: Playlist title - `playlist_title` (string): Playlist title
Available for the video that belongs to some logical chapter or section: Available for the video that belongs to some logical chapter or section:
- `chapter`: Name or title of the chapter the video belongs to - `chapter` (string): Name or title of the chapter the video belongs to
- `chapter_number`: Number of the chapter the video belongs to - `chapter_number` (numeric): Number of the chapter the video belongs to
- `chapter_id`: Id of the chapter the video belongs to - `chapter_id` (string): Id of the chapter the video belongs to
Available for the video that is an episode of some series or programme: Available for the video that is an episode of some series or programme:
- `series`: Title of the series or programme the video episode belongs to - `series` (string): Title of the series or programme the video episode belongs to
- `season`: Title of the season the video episode belongs to - `season` (string): Title of the season the video episode belongs to
- `season_number`: Number of the season the video episode belongs to - `season_number` (numeric): Number of the season the video episode belongs to
- `season_id`: Id of the season the video episode belongs to - `season_id` (string): Id of the season the video episode belongs to
- `episode`: Title of the video episode - `episode` (string): Title of the video episode
- `episode_number`: Number of the video episode within a season - `episode_number` (numeric): Number of the video episode within a season
- `episode_id`: Id of the video episode - `episode_id` (string): Id of the video episode
Available for the media that is a track or a part of a music album: Available for the media that is a track or a part of a music album:
- `track`: Title of the track - `track` (string): Title of the track
- `track_number`: Number of the track within an album or a disc - `track_number` (numeric): Number of the track within an album or a disc
- `track_id`: Id of the track - `track_id` (string): Id of the track
- `artist`: Artist(s) of the track - `artist` (string): Artist(s) of the track
- `genre`: Genre(s) of the track - `genre` (string): Genre(s) of the track
- `album`: Title of the album the track belongs to - `album` (string): Title of the album the track belongs to
- `album_type`: Type of the album - `album_type` (string): Type of the album
- `album_artist`: List of all artists appeared on the album - `album_artist` (string): List of all artists appeared on the album
- `disc_number`: Number of the disc or other physical medium the track belongs to - `disc_number` (numeric): Number of the disc or other physical medium the track belongs to
- `release_year`: Year (YYYY) when the album was released - `release_year` (numeric): Year (YYYY) when the album was released
Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with `NA`. Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with `NA`.
For example for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `youtube-dl test video` and id `BaW_jenozKcj`, this will result in a `youtube-dl test video-BaW_jenozKcj.mp4` file created in the current directory. For example for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `youtube-dl test video` and id `BaW_jenozKcj`, this will result in a `youtube-dl test video-BaW_jenozKcj.mp4` file created in the current directory.
For numeric sequences you can use numeric related formatting, for example, `%(view_count)05d` will result in a string with view count padded with zeros up to 5 characters, like in `00042`.
Output templates can also contain arbitrary hierarchical path, e.g. `-o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s'` which will result in downloading each video in a directory corresponding to this path template. Any missing directory will be automatically created for you. Output templates can also contain arbitrary hierarchical path, e.g. `-o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s'` which will result in downloading each video in a directory corresponding to this path template. Any missing directory will be automatically created for you.
To use percent literals in an output template use `%%`. To output to stdout use `-o -`. To use percent literals in an output template use `%%`. To output to stdout use `-o -`.
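
The template behaviour described in this README hunk maps onto Python's `%` operator applied to a mapping. The following is a minimal sketch under that assumption, not youtube-dl's own code: the `info` dictionary and the `render` helper are hypothetical stand-ins for the metadata an extractor would provide.

```python
import collections

# Hypothetical metadata; in youtube-dl the values come from an extractor.
info = {
    'id': 'BaW_jenozKcj',
    'title': 'youtube-dl test video',
    'ext': 'mp4',
    'view_count': 42,
}

# Fields that are absent fall back to the string 'NA'.
fields = collections.defaultdict(lambda: 'NA', info)


def render(template):
    """Expand an output template with Python's %-formatting."""
    return template % fields


print(render('%(title)s-%(id)s.%(ext)s'))     # string fields
print(render('%(view_count)05d.%(ext)s'))     # numeric field, zero-padded
print(render('%(uploader)s-%(id)s.%(ext)s'))  # missing field becomes NA
print(render('100%% %(id)s'))                 # %% yields a literal percent sign
```

Note that a missing field combined with a numeric conversion such as `%(uploader)05d` would raise `TypeError` in this sketch, because `'NA'` is a string.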


@@ -1,6 +1,7 @@
from __future__ import unicode_literals, print_function from __future__ import unicode_literals, print_function
from inspect import getsource from inspect import getsource
import io
import os import os
from os.path import dirname as dirn from os.path import dirname as dirn
import sys import sys
@@ -95,5 +96,5 @@ module_contents.append(
module_src = '\n'.join(module_contents) + '\n' module_src = '\n'.join(module_contents) + '\n'
with open(lazy_extractors_filename, 'wt') as f: with io.open(lazy_extractors_filename, 'wt', encoding='utf-8') as f:
f.write(module_src) f.write(module_src)


@@ -1,6 +1,6 @@
#!/bin/bash #!/bin/bash
DOWNLOAD_TESTS="age_restriction|download|subtitles|write_annotations|iqiyi_sdk_interpreter" DOWNLOAD_TESTS="age_restriction|download|subtitles|write_annotations|iqiyi_sdk_interpreter|youtube_lists"
test_set="" test_set=""
multiprocess_args="" multiprocess_args=""


@@ -680,6 +680,7 @@
- **Shared**: shared.sx - **Shared**: shared.sx
- **ShowRoomLive** - **ShowRoomLive**
- **Sina** - **Sina**
- **SkylineWebcams**
- **skynewsarabia:article** - **skynewsarabia:article**
- **skynewsarabia:video** - **skynewsarabia:video**
- **SkySports** - **SkySports**
@@ -804,6 +805,7 @@
- **TVCArticle** - **TVCArticle**
- **tvigle**: Интернет-телевидение Tvigle.ru - **tvigle**: Интернет-телевидение Tvigle.ru
- **tvland.com** - **tvland.com**
- **TVN24**
- **TVNoe** - **TVNoe**
- **tvp**: Telewizja Polska - **tvp**: Telewizja Polska
- **tvp:embed**: Telewizja Polska - **tvp:embed**: Telewizja Polska


@@ -107,8 +107,8 @@ setup(
url='https://github.com/rg3/youtube-dl', url='https://github.com/rg3/youtube-dl',
author='Ricardo Garcia', author='Ricardo Garcia',
author_email='ytdl@yt-dl.org', author_email='ytdl@yt-dl.org',
maintainer='Philipp Hagemeister', maintainer='Sergey M.',
maintainer_email='phihag@phihag.de', maintainer_email='dstftw@gmail.com',
packages=[ packages=[
'youtube_dl', 'youtube_dl',
'youtube_dl.extractor', 'youtube_dl.downloader', 'youtube_dl.extractor', 'youtube_dl.downloader',
@@ -130,6 +130,7 @@ setup(
'Programming Language :: Python :: 3.3', 'Programming Language :: Python :: 3.3',
'Programming Language :: Python :: 3.4', 'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5', 'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
], ],
cmdclass={'build_lazy_extractors': build_lazy_extractors}, cmdclass={'build_lazy_extractors': build_lazy_extractors},


@@ -526,6 +526,7 @@ class TestYoutubeDL(unittest.TestCase):
'id': '1234', 'id': '1234',
'ext': 'mp4', 'ext': 'mp4',
'width': None, 'width': None,
'height': 1080,
} }
def fname(templ): def fname(templ):
@@ -535,6 +536,19 @@ class TestYoutubeDL(unittest.TestCase):
self.assertEqual(fname('%(id)s-%(width)s.%(ext)s'), '1234-NA.mp4') self.assertEqual(fname('%(id)s-%(width)s.%(ext)s'), '1234-NA.mp4')
# Replace missing fields with 'NA' # Replace missing fields with 'NA'
self.assertEqual(fname('%(uploader_date)s-%(id)s.%(ext)s'), 'NA-1234.mp4') self.assertEqual(fname('%(uploader_date)s-%(id)s.%(ext)s'), 'NA-1234.mp4')
self.assertEqual(fname('%(height)d.%(ext)s'), '1080.mp4')
self.assertEqual(fname('%(height)6d.%(ext)s'), ' 1080.mp4')
self.assertEqual(fname('%(height)-6d.%(ext)s'), '1080 .mp4')
self.assertEqual(fname('%(height)06d.%(ext)s'), '001080.mp4')
self.assertEqual(fname('%(height) 06d.%(ext)s'), ' 01080.mp4')
self.assertEqual(fname('%(height) 06d.%(ext)s'), ' 01080.mp4')
self.assertEqual(fname('%(height)0 6d.%(ext)s'), ' 01080.mp4')
self.assertEqual(fname('%(height)0 6d.%(ext)s'), ' 01080.mp4')
self.assertEqual(fname('%(height) 0 6d.%(ext)s'), ' 01080.mp4')
self.assertEqual(fname('%%(height)06d.%(ext)s'), '%(height)06d.mp4')
self.assertEqual(fname('%(width)06d.%(ext)s'), 'NA.mp4')
self.assertEqual(fname('%(width)06d.%%(ext)s'), 'NA.%(ext)s')
self.assertEqual(fname('%%(width)06d.%(ext)s'), '%(width)06d.mp4')
def test_format_note(self): def test_format_note(self):
ydl = YoutubeDL() ydl = YoutubeDL()


@@ -33,6 +33,7 @@ from .compat import (
compat_get_terminal_size, compat_get_terminal_size,
compat_http_client, compat_http_client,
compat_kwargs, compat_kwargs,
compat_numeric_types,
compat_os_name, compat_os_name,
compat_str, compat_str,
compat_tokenize_tokenize, compat_tokenize_tokenize,
@@ -327,11 +328,21 @@ class YoutubeDL(object):
self.params.update(params) self.params.update(params)
self.cache = Cache(self) self.cache = Cache(self)
if self.params.get('cn_verification_proxy') is not None: def check_deprecated(param, option, suggestion):
self.report_warning('--cn-verification-proxy is deprecated. Use --geo-verification-proxy instead.') if self.params.get(param) is not None:
self.report_warning(
'%s is deprecated. Use %s instead.' % (option, suggestion))
return True
return False
if check_deprecated('cn_verification_proxy', '--cn-verification-proxy', '--geo-verification-proxy'):
if self.params.get('geo_verification_proxy') is None: if self.params.get('geo_verification_proxy') is None:
self.params['geo_verification_proxy'] = self.params['cn_verification_proxy'] self.params['geo_verification_proxy'] = self.params['cn_verification_proxy']
check_deprecated('autonumber_size', '--autonumber-size', 'output template with %(autonumber)0Nd, where N in the number of digits')
check_deprecated('autonumber', '--auto-number', '-o "%(autonumber)s-%(title)s.%(ext)s"')
check_deprecated('usetitle', '--title', '-o "%(title)s-%(id)s.%(ext)s"')
if params.get('bidi_workaround', False): if params.get('bidi_workaround', False):
try: try:
import pty import pty
@@ -593,10 +604,7 @@ class YoutubeDL(object):
autonumber_size = self.params.get('autonumber_size') autonumber_size = self.params.get('autonumber_size')
if autonumber_size is None: if autonumber_size is None:
autonumber_size = 5 autonumber_size = 5
autonumber_templ = '%0' + str(autonumber_size) + 'd' template_dict['autonumber'] = self.params.get('autonumber_start', 1) - 1 + self._num_downloads
template_dict['autonumber'] = autonumber_templ % (self.params.get('autonumber_start', 1) - 1 + self._num_downloads)
if template_dict.get('playlist_index') is not None:
template_dict['playlist_index'] = '%0*d' % (len(str(template_dict['n_entries'])), template_dict['playlist_index'])
if template_dict.get('resolution') is None: if template_dict.get('resolution') is None:
if template_dict.get('width') and template_dict.get('height'): if template_dict.get('width') and template_dict.get('height'):
template_dict['resolution'] = '%dx%d' % (template_dict['width'], template_dict['height']) template_dict['resolution'] = '%dx%d' % (template_dict['width'], template_dict['height'])
@@ -609,12 +617,61 @@ class YoutubeDL(object):
compat_str(v), compat_str(v),
restricted=self.params.get('restrictfilenames'), restricted=self.params.get('restrictfilenames'),
is_id=(k == 'id')) is_id=(k == 'id'))
template_dict = dict((k, sanitize(k, v)) template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
for k, v in template_dict.items() for k, v in template_dict.items()
if v is not None and not isinstance(v, (list, tuple, dict))) if v is not None and not isinstance(v, (list, tuple, dict)))
template_dict = collections.defaultdict(lambda: 'NA', template_dict) template_dict = collections.defaultdict(lambda: 'NA', template_dict)
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL) outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
# For fields playlist_index and autonumber convert all occurrences
# of %(field)s to %(field)0Nd for backward compatibility
field_size_compat_map = {
'playlist_index': len(str(template_dict['n_entries'])),
'autonumber': autonumber_size,
}
FIELD_SIZE_COMPAT_RE = r'(?<!%)%\((?P<field>autonumber|playlist_index)\)s'
mobj = re.search(FIELD_SIZE_COMPAT_RE, outtmpl)
if mobj:
outtmpl = re.sub(
FIELD_SIZE_COMPAT_RE,
r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')],
outtmpl)
NUMERIC_FIELDS = set((
'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx',
'upload_year', 'upload_month', 'upload_day',
'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
'average_rating', 'comment_count', 'age_limit',
'start_time', 'end_time',
'chapter_number', 'season_number', 'episode_number',
'track_number', 'disc_number', 'release_year',
'playlist_index',
))
# Missing numeric fields used together with integer presentation types
# in format specification will break the argument substitution since
# string 'NA' is returned for missing fields. We will patch output
# template for missing fields to meet string presentation type.
for numeric_field in NUMERIC_FIELDS:
if numeric_field not in template_dict:
# As of [1] format syntax is:
# %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
# 1. https://docs.python.org/2/library/stdtypes.html#string-formatting
FORMAT_RE = r'''(?x)
(?<!%)
%
\({0}\) # mapping key
(?:[#0\-+ ]+)? # conversion flags (optional)
(?:\d+)? # minimum field width (optional)
(?:\.\d+)? # precision (optional)
[hlL]? # length modifier (optional)
[diouxXeEfFgGcrs%] # conversion type
'''
outtmpl = re.sub(
FORMAT_RE.format(numeric_field),
r'%({0})s'.format(numeric_field), outtmpl)
tmpl = compat_expanduser(outtmpl) tmpl = compat_expanduser(outtmpl)
filename = tmpl % template_dict filename = tmpl % template_dict
# Temporary fix for #4787 # Temporary fix for #4787
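
The two template rewrites in this hunk can be tried in isolation. The sketch below is a simplified, standalone variation and not the commit's code: `NUMERIC_FIELDS` is shortened, and `pad_legacy_fields`, `render_with_na` and the `widths` mapping are hypothetical names. One deliberate difference: the hunk appears to build its replacement string once from the first match (`mobj.group('field')`), so a template containing both `%(autonumber)s` and `%(playlist_index)s` would get the same width for both. The sketch uses a replacement callback so that each field gets its own width.

```python
import collections
import re

# Shortened, illustrative subset of the commit's NUMERIC_FIELDS.
NUMERIC_FIELDS = ('width', 'height', 'playlist_index')


def pad_legacy_fields(template, widths):
    """Rewrite %(field)s to %(field)0Nd, using a separate width per field.

    `widths` is a hypothetical mapping of field name to digit count;
    the names are assumed to be plain identifiers.
    """
    pattern = r'(?<!%)%\((?P<field>{0})\)s'.format('|'.join(widths))

    def replace(match):
        field = match.group('field')
        return '%({0})0{1}d'.format(field, widths[field])

    return re.sub(pattern, replace, template)


def render_with_na(template, info):
    """Expand a template; missing numeric fields render as 'NA'."""
    known = dict((k, v) for k, v in info.items() if v is not None)
    fields = collections.defaultdict(lambda: 'NA', known)
    for name in NUMERIC_FIELDS:
        if name in known:
            continue
        # Downgrade any conversion on a missing field to plain %(name)s,
        # so that the string 'NA' can be substituted without a TypeError.
        pattern = (r'(?<!%)%\({0}\)[#0\-+ ]*\d*(?:\.\d+)?[hlL]?'
                   r'[diouxXeEfFgGcrs]').format(name)
        template = re.sub(pattern, '%({0})s'.format(name), template)
    return template % fields


info = {'id': '1234', 'ext': 'mp4', 'width': None, 'height': 1080}
print(render_with_na('%(height)06d.%(ext)s', info))
print(render_with_na('%(width)06d.%(ext)s', info))
print(pad_legacy_fields('%(autonumber)s-%(playlist_index)s',
                        {'autonumber': 5, 'playlist_index': 2}))
```

The negative lookbehind `(?<!%)` keeps escaped sequences such as `%%(width)06d` untouched, which matches the expectations in the test hunk of this commit.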


@@ -416,6 +416,9 @@ def _real_main(argv=None):
'config_location': opts.config_location, 'config_location': opts.config_location,
'geo_bypass': opts.geo_bypass, 'geo_bypass': opts.geo_bypass,
'geo_bypass_country': opts.geo_bypass_country, 'geo_bypass_country': opts.geo_bypass_country,
# just for deprecation check
'autonumber': opts.autonumber if opts.autonumber is True else None,
'usetitle': opts.usetitle if opts.usetitle is True else None,
} }
with YoutubeDL(ydl_opts) as ydl: with YoutubeDL(ydl_opts) as ydl:


@@ -2760,6 +2760,12 @@ else:
compat_kwargs = lambda kwargs: kwargs compat_kwargs = lambda kwargs: kwargs
try:
compat_numeric_types = (int, float, long, complex)
except NameError: # Python 3
compat_numeric_types = (int, float, complex)
if sys.version_info < (2, 7): if sys.version_info < (2, 7):
def compat_socket_create_connection(address, timeout, source_address=None): def compat_socket_create_connection(address, timeout, source_address=None):
host, port = address host, port = address
@ -2895,6 +2901,7 @@ __all__ = [
'compat_input', 'compat_input',
'compat_itertools_count', 'compat_itertools_count',
'compat_kwargs', 'compat_kwargs',
'compat_numeric_types',
'compat_ord', 'compat_ord',
'compat_os_name', 'compat_os_name',
'compat_parse_qs', 'compat_parse_qs',
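As an illustration of how a tuple like `compat_numeric_types` might be consumed, here is a small sketch. The `is_numeric` helper is hypothetical and only shows the `isinstance` check; it is not taken from the commit.

```python
try:
    # Python 2 has a separate arbitrary-precision integer type.
    compat_numeric_types = (int, float, long, complex)  # noqa: F821
except NameError:  # Python 3: `long` no longer exists
    compat_numeric_types = (int, float, complex)


def is_numeric(value):
    """Hypothetical helper: True for any of the numeric types above."""
    return isinstance(value, compat_numeric_types)
```

Note that `bool` is a subclass of `int`, so `is_numeric(True)` is also true under this check.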

View File

@ -347,7 +347,10 @@ class FileDownloader(object):
if min_sleep_interval: if min_sleep_interval:
max_sleep_interval = self.params.get('max_sleep_interval', min_sleep_interval) max_sleep_interval = self.params.get('max_sleep_interval', min_sleep_interval)
sleep_interval = random.uniform(min_sleep_interval, max_sleep_interval) sleep_interval = random.uniform(min_sleep_interval, max_sleep_interval)
self.to_screen('[download] Sleeping %s seconds...' % sleep_interval) self.to_screen(
'[download] Sleeping %s seconds...' % (
int(sleep_interval) if sleep_interval.is_integer()
else '%.2f' % sleep_interval))
time.sleep(sleep_interval) time.sleep(sleep_interval)
return self.real_download(filename, info_dict) return self.real_download(filename, info_dict)
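The message formatting introduced above can be sketched on its own. The function name `sleep_message` is an assumption for illustration, and the `float()` cast is an addition of this sketch so that plain integers are accepted too.

```python
def sleep_message(sleep_interval):
    """Show whole seconds without decimals, otherwise two decimal places."""
    if float(sleep_interval).is_integer():
        shown = int(sleep_interval)
    else:
        shown = '%.2f' % sleep_interval
    return '[download] Sleeping %s seconds...' % shown
```

Since `random.uniform` returns a float, the previous `%s` formatting printed the full float representation; the two-branch form keeps the output short in both cases.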

View File

@ -31,6 +31,11 @@ MSO_INFO = {
'username_field': 'user', 'username_field': 'user',
'password_field': 'passwd', 'password_field': 'passwd',
}, },
'TWC': {
'name': 'Time Warner Cable | Spectrum',
'username_field': 'Ecom_User_ID',
'password_field': 'Ecom_Password',
},
'thr030': { 'thr030': {
'name': '3 Rivers Communications' 'name': '3 Rivers Communications'
}, },

View File

@ -10,7 +10,7 @@ from ..utils import (
class AMCNetworksIE(ThePlatformIE): class AMCNetworksIE(ThePlatformIE):
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?[^/]+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1', 'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
'md5': '', 'md5': '',
@ -44,6 +44,12 @@ class AMCNetworksIE(ThePlatformIE):
}, { }, {
'url': 'http://www.bbcamerica.com/shows/doctor-who/full-episodes/the-power-of-the-daleks/episode-01-episode-1-color-version', 'url': 'http://www.bbcamerica.com/shows/doctor-who/full-episodes/the-power-of-the-daleks/episode-01-episode-1-color-version',
'only_matching': True, 'only_matching': True,
}, {
'url': 'http://www.wetv.com/shows/mama-june-from-not-to-hot/full-episode/season-01/thin-tervention',
'only_matching': True,
}, {
'url': 'http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
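To see what the relaxed `_VALID_URL` changes, the old and new patterns from the hunk above can be compared against one of the newly added test URLs. This is a sketch using only `re.match`; the variable names are chosen for this example.

```python
import re

OLD_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?[^/]+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
NEW_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'

url = 'http://www.wetv.com/shows/mama-june-from-not-to-hot/full-episode/season-01/thin-tervention'

old_match = re.match(OLD_VALID_URL, url)
new_match = re.match(NEW_VALID_URL, url)
# The new pattern accepts any number of path segments after /shows
# and captures the last one as the id.
video_id = new_match.group('id')
```

The old pattern required an `episode-<number>` segment, which this URL does not have, so it does not match.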

View File

@ -1,6 +1,7 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import codecs
import re import re
from .common import InfoExtractor from .common import InfoExtractor
@ -96,6 +97,10 @@ class CDAIE(InfoExtractor):
if not video or 'file' not in video: if not video or 'file' not in video:
self.report_warning('Unable to extract %s version information' % version) self.report_warning('Unable to extract %s version information' % version)
return return
if video['file'].startswith('uggc'):
video['file'] = codecs.decode(video['file'], 'rot_13')
if video['file'].endswith('adc.mp4'):
video['file'] = video['file'].replace('adc.mp4', '.mp4')
f = { f = {
'url': video['file'], 'url': video['file'],
} }
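The two URL fixups added above (ROT13 decoding and the `adc.mp4` suffix) can be sketched as a standalone function. The name `normalize_file_url` and the example URL are assumptions made for this illustration.

```python
import codecs


def normalize_file_url(file_url):
    """Undo ROT13 obfuscation and strip the 'adc' marker before '.mp4'."""
    # 'uggc' is 'http' under ROT13, so it signals an obfuscated URL.
    if file_url.startswith('uggc'):
        file_url = codecs.decode(file_url, 'rot_13')
    if file_url.endswith('adc.mp4'):
        file_url = file_url.replace('adc.mp4', '.mp4')
    return file_url


obfuscated = codecs.encode('http://example.com/clipadc.mp4', 'rot_13')
clean = normalize_file_url(obfuscated)
```

ROT13 only rotates ASCII letters, so digits and punctuation in the URL pass through unchanged.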

View File

@ -333,6 +333,9 @@ class InfoExtractor(object):
geo restriction bypass mechanism right away in order to bypass geo restriction bypass mechanism right away in order to bypass
geo restriction, of course, if the mechanism is not disabled. (experimental) geo restriction, of course, if the mechanism is not disabled. (experimental)
NB: both of these geo attributes are experimental and may change in the
future or be removed completely.
Finally, the _WORKING attribute should be set to False for broken IEs Finally, the _WORKING attribute should be set to False for broken IEs
in order to warn the users and skip the tests. in order to warn the users and skip the tests.
""" """
@ -376,12 +379,28 @@ class InfoExtractor(object):
def initialize(self): def initialize(self):
"""Initializes an instance (authentication, etc).""" """Initializes an instance (authentication, etc)."""
self.__initialize_geo_bypass() self._initialize_geo_bypass(self._GEO_COUNTRIES)
if not self._ready: if not self._ready:
self._real_initialize() self._real_initialize()
self._ready = True self._ready = True
def __initialize_geo_bypass(self): def _initialize_geo_bypass(self, countries):
"""
Initialize the geo restriction bypass mechanism.
This method initializes the geo bypass mechanism, which is based on faking
the X-Forwarded-For HTTP header. A random country from the provided country
list is selected and a random IP belonging to this country is generated. This
IP will be passed as the X-Forwarded-For HTTP header in all subsequent
HTTP requests.
This method is used for the initial geo bypass mechanism initialization
during instance initialization with _GEO_COUNTRIES.
You may also call it manually from the extractor's code if geo countries
information is not available beforehand (e.g. obtained during
extraction) or for some other reason.
"""
if not self._x_forwarded_for_ip: if not self._x_forwarded_for_ip:
country_code = self._downloader.params.get('geo_bypass_country', None) country_code = self._downloader.params.get('geo_bypass_country', None)
# If there is no explicit country for geo bypass specified and # If there is no explicit country for geo bypass specified and
@ -390,13 +409,14 @@ class InfoExtractor(object):
if (not country_code and if (not country_code and
self._GEO_BYPASS and self._GEO_BYPASS and
self._downloader.params.get('geo_bypass', True) and self._downloader.params.get('geo_bypass', True) and
self._GEO_COUNTRIES): countries):
country_code = random.choice(self._GEO_COUNTRIES) country_code = random.choice(countries)
if country_code: if country_code:
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code) self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code)
if self._downloader.params.get('verbose', False): if self._downloader.params.get('verbose', False):
self._downloader.to_stdout( self._downloader.to_stdout(
'[debug] Using fake %s IP as X-Forwarded-For.' % self._x_forwarded_for_ip) '[debug] Using fake IP %s (%s) as X-Forwarded-For.'
% (self._x_forwarded_for_ip, country_code.upper()))
def extract(self, url): def extract(self, url):
"""Extracts URL information and returns it in list of dicts.""" """Extracts URL information and returns it in list of dicts."""
@ -425,10 +445,12 @@ class InfoExtractor(object):
self._downloader.params.get('geo_bypass', True) and self._downloader.params.get('geo_bypass', True) and
not self._x_forwarded_for_ip and not self._x_forwarded_for_ip and
countries): countries):
self._x_forwarded_for_ip = GeoUtils.random_ipv4(random.choice(countries)) country_code = random.choice(countries)
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code)
if self._x_forwarded_for_ip: if self._x_forwarded_for_ip:
self.report_warning( self.report_warning(
'Video is geo restricted. Retrying extraction with fake %s IP as X-Forwarded-For.' % self._x_forwarded_for_ip) 'Video is geo restricted. Retrying extraction with fake IP %s (%s) as X-Forwarded-For.'
% (self._x_forwarded_for_ip, country_code.upper()))
return True return True
return False return False
@ -1988,7 +2010,7 @@ class InfoExtractor(object):
}) })
return formats return formats
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None): def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None, preference=None):
def absolute_url(video_url): def absolute_url(video_url):
return compat_urlparse.urljoin(base_url, video_url) return compat_urlparse.urljoin(base_url, video_url)
@ -2010,7 +2032,8 @@ class InfoExtractor(object):
is_plain_url = False is_plain_url = False
formats = self._extract_m3u8_formats( formats = self._extract_m3u8_formats(
full_url, video_id, ext='mp4', full_url, video_id, ext='mp4',
entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id) entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id,
preference=preference)
elif ext == 'mpd': elif ext == 'mpd':
is_plain_url = False is_plain_url = False
formats = self._extract_mpd_formats( formats = self._extract_mpd_formats(
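The country selection described in the `_initialize_geo_bypass` docstring can be sketched without the extractor machinery. Everything below is an assumption made for illustration: `COUNTRY_BLOCKS` is a hypothetical table filled with RFC 5737 documentation ranges rather than real country allocations, and `choose_fake_ip` is not a youtube-dl function.

```python
import ipaddress
import random

# Hypothetical table: documentation ranges, NOT real per-country blocks.
COUNTRY_BLOCKS = {
    'US': '192.0.2.0/24',
    'CZ': '198.51.100.0/24',
}


def random_ipv4(country_code):
    """Pick a random address from the block registered for the country."""
    block = COUNTRY_BLOCKS.get(country_code.upper())
    if block is None:
        return None
    network = ipaddress.ip_network(block)
    return str(network[random.randrange(network.num_addresses)])


def choose_fake_ip(countries, explicit_country=None, geo_bypass=True):
    """Mirror the selection order: explicit country first, then a random one."""
    country_code = explicit_country
    if not country_code and geo_bypass and countries:
        country_code = random.choice(countries)
    if not country_code:
        return None
    ip = random_ipv4(country_code)
    if ip is None:
        return None
    return ip, country_code.upper()
```

A caller would then send the chosen address as `{'X-Forwarded-For': ip}` on subsequent requests. Passing the country list as an argument, as the refactored method does, lets an extractor supply countries it only learns during extraction.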

View File

@ -6,6 +6,7 @@ from ..utils import int_or_none
class CrackleIE(InfoExtractor): class CrackleIE(InfoExtractor):
_GEO_COUNTRIES = ['US']
_VALID_URL = r'(?:crackle:|https?://(?:(?:www|m)\.)?crackle\.com/(?:playlist/\d+/|(?:[^/]+/)+))(?P<id>\d+)' _VALID_URL = r'(?:crackle:|https?://(?:(?:www|m)\.)?crackle\.com/(?:playlist/\d+/|(?:[^/]+/)+))(?P<id>\d+)'
_TEST = { _TEST = {
'url': 'http://www.crackle.com/comedians-in-cars-getting-coffee/2498934', 'url': 'http://www.crackle.com/comedians-in-cars-getting-coffee/2498934',

View File

@ -123,7 +123,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
'url': 'http://www.crunchyroll.com/wanna-be-the-strongest-in-the-world/episode-1-an-idol-wrestler-is-born-645513', 'url': 'http://www.crunchyroll.com/wanna-be-the-strongest-in-the-world/episode-1-an-idol-wrestler-is-born-645513',
'info_dict': { 'info_dict': {
'id': '645513', 'id': '645513',
'ext': 'flv', 'ext': 'mp4',
'title': 'Wanna be the Strongest in the World Episode 1 An Idol-Wrestler is Born!', 'title': 'Wanna be the Strongest in the World Episode 1 An Idol-Wrestler is Born!',
'description': 'md5:2d17137920c64f2f49981a7797d275ef', 'description': 'md5:2d17137920c64f2f49981a7797d275ef',
'thumbnail': 'http://img1.ak.crunchyroll.com/i/spire1-tmb/20c6b5e10f1a47b10516877d3c039cae1380951166_full.jpg', 'thumbnail': 'http://img1.ak.crunchyroll.com/i/spire1-tmb/20c6b5e10f1a47b10516877d3c039cae1380951166_full.jpg',
@ -192,6 +192,36 @@ class CrunchyrollIE(CrunchyrollBaseIE):
# geo-restricted (US), 18+ maturity wall, non-premium available # geo-restricted (US), 18+ maturity wall, non-premium available
'url': 'http://www.crunchyroll.com/cosplay-complex-ova/episode-1-the-birth-of-the-cosplay-club-565617', 'url': 'http://www.crunchyroll.com/cosplay-complex-ova/episode-1-the-birth-of-the-cosplay-club-565617',
'only_matching': True, 'only_matching': True,
}, {
# A description with double quotes
'url': 'http://www.crunchyroll.com/11eyes/episode-1-piros-jszaka-red-night-535080',
'info_dict': {
'id': '535080',
'ext': 'mp4',
'title': '11eyes Episode 1 Piros éjszaka - Red Night',
'description': 'Kakeru and Yuka are thrown into an alternate nightmarish world they call "Red Night".',
'uploader': 'Marvelous AQL Inc.',
'upload_date': '20091021',
},
'params': {
# Just test metadata extraction
'skip_download': True,
},
}, {
# make sure we can extract an uploader name that's not a link
'url': 'http://www.crunchyroll.com/hakuoki-reimeiroku/episode-1-dawn-of-the-divine-warriors-606899',
'info_dict': {
'id': '606899',
'ext': 'mp4',
'title': 'Hakuoki Reimeiroku Episode 1 Dawn of the Divine Warriors',
'description': 'Ryunosuke was left to die, but Serizawa-san asked him a simple question "Do you want to live?"',
'uploader': 'Geneon Entertainment',
'upload_date': '20120717',
},
'params': {
# just test metadata extraction
'skip_download': True,
},
}] }]
_FORMAT_IDS = { _FORMAT_IDS = {
@ -362,9 +392,9 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
r'(?s)<h1[^>]*>((?:(?!<h1).)*?<span[^>]+itemprop=["\']title["\'][^>]*>(?:(?!<h1).)+?)</h1>', r'(?s)<h1[^>]*>((?:(?!<h1).)*?<span[^>]+itemprop=["\']title["\'][^>]*>(?:(?!<h1).)+?)</h1>',
webpage, 'video_title') webpage, 'video_title')
video_title = re.sub(r' {2,}', ' ', video_title) video_title = re.sub(r' {2,}', ' ', video_title)
video_description = self._html_search_regex( video_description = self._parse_json(self._html_search_regex(
r'<script[^>]*>\s*.+?\[media_id=%s\].+?"description"\s*:\s*"([^"]+)' % video_id, r'<script[^>]*>\s*.+?\[media_id=%s\].+?({.+?"description"\s*:.+?})\);' % video_id,
webpage, 'description', default=None) webpage, 'description', default='{}'), video_id).get('description')
if video_description: if video_description:
video_description = lowercase_escape(video_description.replace(r'\r\n', '\n')) video_description = lowercase_escape(video_description.replace(r'\r\n', '\n'))
video_upload_date = self._html_search_regex( video_upload_date = self._html_search_regex(
@ -373,8 +403,9 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
if video_upload_date: if video_upload_date:
video_upload_date = unified_strdate(video_upload_date) video_upload_date = unified_strdate(video_upload_date)
video_uploader = self._html_search_regex( video_uploader = self._html_search_regex(
r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', webpage, # try looking for both an uploader that's a link and one that's not
'video_uploader', fatal=False) [r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', r'<div>\s*Publisher:\s*<span>\s*(.+?)\s*</span>\s*</div>'],
webpage, 'video_uploader', fatal=False)
available_fmts = [] available_fmts = []
for a, fmt in re.findall(r'(<a[^>]+token=["\']showmedia\.([0-9]{3,4})p["\'][^>]+>)', webpage): for a, fmt in re.findall(r'(<a[^>]+token=["\']showmedia\.([0-9]{3,4})p["\'][^>]+>)', webpage):
@ -519,11 +550,11 @@ class CrunchyrollShowPlaylistIE(CrunchyrollBaseIE):
r'(?s)<h1[^>]*>\s*<span itemprop="name">(.*?)</span>', r'(?s)<h1[^>]*>\s*<span itemprop="name">(.*?)</span>',
webpage, 'title') webpage, 'title')
episode_paths = re.findall( episode_paths = re.findall(
r'(?s)<li id="showview_videos_media_[0-9]+"[^>]+>.*?<a href="([^"]+)"', r'(?s)<li id="showview_videos_media_(\d+)"[^>]+>.*?<a href="([^"]+)"',
webpage) webpage)
entries = [ entries = [
self.url_result('http://www.crunchyroll.com' + ep, 'Crunchyroll') self.url_result('http://www.crunchyroll.com' + ep, 'Crunchyroll', ep_id)
for ep in episode_paths for ep_id, ep in episode_paths
] ]
entries.reverse() entries.reverse()
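The switch from a quoted-string regex to JSON parsing for the description matters when the text contains escaped double quotes. A small sketch with a made-up JSON snippet (the `snippet` value is an assumption, not real page content):

```python
import json
import re

# Hypothetical fragment of the embedded metadata object.
snippet = r'{"name": "11eyes", "description": "a world they call \"Red Night\"."}'

# Old approach: stop at the first double quote, even an escaped one.
naive = re.search(r'"description"\s*:\s*"([^"]+)', snippet).group(1)

# New approach: parse the whole object and read the key.
parsed = json.loads(snippet).get('description')
```

Here `naive` is cut off right before the first escaped quote and ends with a stray backslash, while `parsed` holds the full sentence with real quote characters.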

View File

@ -66,7 +66,6 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'uploader_id': 'xijv66', 'uploader_id': 'xijv66',
'age_limit': 0, 'age_limit': 0,
'view_count': int, 'view_count': int,
'comment_count': int,
} }
}, },
# Vevo video # Vevo video
@ -140,7 +139,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
view_count = str_to_int(view_count_str) view_count = str_to_int(view_count_str)
comment_count = int_or_none(self._search_regex( comment_count = int_or_none(self._search_regex(
r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserComments:(\d+)"', r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserComments:(\d+)"',
webpage, 'comment count', fatal=False)) webpage, 'comment count', default=None))
player_v5 = self._search_regex( player_v5 = self._search_regex(
[r'buildPlayer\(({.+?})\);\n', # See https://github.com/rg3/youtube-dl/issues/7826 [r'buildPlayer\(({.+?})\);\n', # See https://github.com/rg3/youtube-dl/issues/7826
@ -283,9 +282,14 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
} }
def _check_error(self, info): def _check_error(self, info):
error = info.get('error')
if info.get('error') is not None: if info.get('error') is not None:
title = error['title']
# See https://developer.dailymotion.com/api#access-error
if error.get('code') == 'DM007':
self.raise_geo_restricted(msg=title)
raise ExtractorError( raise ExtractorError(
'%s said: %s' % (self.IE_NAME, info['error']['title']), expected=True) '%s said: %s' % (self.IE_NAME, title), expected=True)
def _get_subtitles(self, video_id, webpage): def _get_subtitles(self, video_id, webpage):
try: try:
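The error dispatch added to `_check_error` can be sketched with plain exceptions. `GeoRestrictedError` and `ExtractionFailed` are hypothetical stand-ins for the real youtube-dl exception types, and `check_error` is a free function written for this example.

```python
class GeoRestrictedError(Exception):
    """Hypothetical stand-in for a geo restriction error."""


class ExtractionFailed(Exception):
    """Hypothetical stand-in for a generic, expected extraction error."""


def check_error(info, ie_name='dailymotion'):
    error = info.get('error')
    if error is not None:
        title = error['title']
        # DM007 is the access error code the hunk treats as geo restriction.
        if error.get('code') == 'DM007':
            raise GeoRestrictedError(title)
        raise ExtractionFailed('%s said: %s' % (ie_name, title))
```

Raising a dedicated geo error first means callers can react to it (for example by retrying with a different country) before falling back to the generic message.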

View File

@ -0,0 +1,39 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor


class ETOnlineIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?etonline\.com/(?:[^/]+/)*(?P<id>[^/?#&]+)'
    _TESTS = [{
        'url': 'http://www.etonline.com/tv/211130_dove_cameron_liv_and_maddie_emotional_episode_series_finale/',
        'info_dict': {
            'id': '211130_dove_cameron_liv_and_maddie_emotional_episode_series_finale',
            'title': 'md5:a21ec7d3872ed98335cbd2a046f34ee6',
            'description': 'md5:8b94484063f463cca709617c79618ccd',
        },
        'playlist_count': 2,
    }, {
        'url': 'http://www.etonline.com/media/video/here_are_the_stars_who_love_bringing_their_moms_as_dates_to_the_oscars-211359/',
        'only_matching': True,
    }]
    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1242911076001/default_default/index.html?videoId=ref:%s'

    def _real_extract(self, url):
        playlist_id = self._match_id(url)

        webpage = self._download_webpage(url, playlist_id)

        entries = [
            self.url_result(
                self.BRIGHTCOVE_URL_TEMPLATE % video_id, 'BrightcoveNew', video_id)
            for video_id in re.findall(
                r'site\.brightcove\s*\([^,]+,\s*["\'](title_\d+)', webpage)]

        return self.playlist_result(
            entries, playlist_id,
            self._og_search_title(webpage, fatal=False),
            self._og_search_description(webpage))
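The ID extraction in this new extractor comes down to one `re.findall` call plus a URL template. A minimal sketch, assuming a made-up page fragment (`webpage` below is hypothetical and not real etonline.com markup):

```python
import re

BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1242911076001/default_default/index.html?videoId=ref:%s'

# Hypothetical markup with two embedded players.
webpage = '''
<script>site.brightcove("player-1", "title_100", {});</script>
<script>site.brightcove ( 'player-2', 'title_200' );</script>
'''

video_ids = re.findall(
    r'site\.brightcove\s*\([^,]+,\s*["\'](title_\d+)', webpage)
player_urls = [BRIGHTCOVE_URL_TEMPLATE % video_id for video_id in video_ids]
```

Each resulting URL is what the extractor hands to the Brightcove extractor as a playlist entry.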

View File

@ -288,6 +288,7 @@ from .espn import (
ESPNArticleIE, ESPNArticleIE,
) )
from .esri import EsriVideoIE from .esri import EsriVideoIE
from .etonline import ETOnlineIE
from .europa import EuropaIE from .europa import EuropaIE
from .everyonesmixtape import EveryonesMixtapeIE from .everyonesmixtape import EveryonesMixtapeIE
from .expotv import ExpoTVIE from .expotv import ExpoTVIE
@ -338,6 +339,7 @@ from .francetv import (
) )
from .freesound import FreesoundIE from .freesound import FreesoundIE
from .freespeech import FreespeechIE from .freespeech import FreespeechIE
from .freshlive import FreshLiveIE
from .funimation import FunimationIE from .funimation import FunimationIE
from .funnyordie import FunnyOrDieIE from .funnyordie import FunnyOrDieIE
from .fusion import FusionIE from .fusion import FusionIE
@ -637,6 +639,7 @@ from .ninecninemedia import (
from .ninegag import NineGagIE from .ninegag import NineGagIE
from .ninenow import NineNowIE from .ninenow import NineNowIE
from .nintendo import NintendoIE from .nintendo import NintendoIE
from .njpwworld import NJPWWorldIE
from .nobelprize import NobelPrizeIE from .nobelprize import NobelPrizeIE
from .noco import NocoIE from .noco import NocoIE
from .normalboots import NormalbootsIE from .normalboots import NormalbootsIE
@ -852,6 +855,7 @@ from .shared import (
from .showroomlive import ShowRoomLiveIE from .showroomlive import ShowRoomLiveIE
from .sina import SinaIE from .sina import SinaIE
from .sixplay import SixPlayIE from .sixplay import SixPlayIE
from .skylinewebcams import SkylineWebcamsIE
from .skynewsarabia import ( from .skynewsarabia import (
SkyNewsArabiaIE, SkyNewsArabiaIE,
SkyNewsArabiaArticleIE, SkyNewsArabiaArticleIE,

View File

@ -0,0 +1,84 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
    ExtractorError,
    int_or_none,
    try_get,
    unified_timestamp,
)


class FreshLiveIE(InfoExtractor):
    _VALID_URL = r'https?://freshlive\.tv/[^/]+/(?P<id>\d+)'
    _TEST = {
        'url': 'https://freshlive.tv/satotv/74712',
        'md5': '9f0cf5516979c4454ce982df3d97f352',
        'info_dict': {
            'id': '74712',
            'ext': 'mp4',
            'title': 'テスト',
            'description': 'テスト',
            'thumbnail': r're:^https?://.*\.jpg$',
            'duration': 1511,
            'timestamp': 1483619655,
            'upload_date': '20170105',
            'uploader': 'サトTV',
            'uploader_id': 'satotv',
            'view_count': int,
            'comment_count': int,
            'is_live': False,
        }
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id)

        options = self._parse_json(
            self._search_regex(
                r'window\.__CONTEXT__\s*=\s*({.+?});\s*</script>',
                webpage, 'initial context'),
            video_id)

        info = options['context']['dispatcher']['stores']['ProgramStore']['programs'][video_id]

        title = info['title']

        if info.get('status') == 'upcoming':
            raise ExtractorError('Stream %s is upcoming' % video_id, expected=True)

        stream_url = info.get('liveStreamUrl') or info['archiveStreamUrl']
        is_live = info.get('liveStreamUrl') is not None

        formats = self._extract_m3u8_formats(
            stream_url, video_id, ext='mp4',
            entry_protocol='m3u8' if is_live else 'm3u8_native',
            m3u8_id='hls')

        if is_live:
            title = self._live_title(title)

        return {
            'id': video_id,
            'formats': formats,
            'title': title,
            'description': info.get('description'),
            'thumbnail': info.get('thumbnailUrl'),
            'duration': int_or_none(info.get('airTime')),
            'timestamp': unified_timestamp(info.get('createdAt')),
            'uploader': try_get(
                info, lambda x: x['channel']['title'], compat_str),
            'uploader_id': try_get(
                info, lambda x: x['channel']['code'], compat_str),
            'uploader_url': try_get(
                info, lambda x: x['channel']['permalink'], compat_str),
            'view_count': int_or_none(info.get('viewCount')),
            'comment_count': int_or_none(info.get('commentCount')),
            'tags': info.get('tags', []),
            'is_live': is_live,
        }
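The extractor above leans on `try_get` to read nested, possibly absent keys. The following is a simplified re-implementation written for this example, not the library's own code; it only illustrates the "return None instead of raising" behaviour the extractor relies on.

```python
def try_get(src, getter, expected_type=None):
    """Simplified sketch: apply getter, swallow lookup errors, check the type."""
    try:
        value = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(value, expected_type):
        return value
    return None


# Hypothetical program record shaped like the fields read above.
info = {'channel': {'title': 'Sato TV', 'code': 'satotv'}}

uploader = try_get(info, lambda x: x['channel']['title'], str)
permalink = try_get(info, lambda x: x['channel']['permalink'], str)
```

With this pattern a missing `channel` block or a value of the wrong type yields `None` for that one field, and the rest of the metadata is still returned.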

View File

@ -37,7 +37,6 @@ class GoIE(AdobePassIE):
} }
} }
_VALID_URL = r'https?://(?:(?P<sub_domain>%s)\.)?go\.com/(?:[^/]+/)*(?:vdka(?P<id>\w+)|season-\d+/\d+-(?P<display_id>[^/?#]+))' % '|'.join(_SITE_INFO.keys()) _VALID_URL = r'https?://(?:(?P<sub_domain>%s)\.)?go\.com/(?:[^/]+/)*(?:vdka(?P<id>\w+)|season-\d+/\d+-(?P<display_id>[^/?#]+))' % '|'.join(_SITE_INFO.keys())
_GEO_COUNTRIES = ['US']
_TESTS = [{ _TESTS = [{
'url': 'http://abc.go.com/shows/castle/video/most-recent/vdka0_g86w5onx', 'url': 'http://abc.go.com/shows/castle/video/most-recent/vdka0_g86w5onx',
'info_dict': { 'info_dict': {
@ -79,44 +78,60 @@ class GoIE(AdobePassIE):
ext = determine_ext(asset_url) ext = determine_ext(asset_url)
if ext == 'm3u8': if ext == 'm3u8':
video_type = video_data.get('type') video_type = video_data.get('type')
if video_type == 'lf': data = {
data = { 'video_id': video_data['id'],
'video_id': video_data['id'], 'video_type': video_type,
'video_type': video_type, 'brand': brand,
'brand': brand, 'device': '001',
'device': '001', }
} if video_data.get('accesslevel') == '1':
if video_data.get('accesslevel') == '1': requestor_id = site_info['requestor_id']
requestor_id = site_info['requestor_id'] resource = self._get_mvpd_resource(
resource = self._get_mvpd_resource( requestor_id, title, video_id, None)
requestor_id, title, video_id, None) auth = self._extract_mvpd_auth(
auth = self._extract_mvpd_auth( url, video_id, requestor_id, resource)
url, video_id, requestor_id, resource) data.update({
data.update({ 'token': auth,
'token': auth, 'token_type': 'ap',
'token_type': 'ap', 'adobe_requestor_id': requestor_id,
'adobe_requestor_id': requestor_id, })
}) else:
entitlement = self._download_json( self._initialize_geo_bypass(['US'])
'https://api.entitlement.watchabc.go.com/vp2/ws-secure/entitlement/2020/authorize.json', entitlement = self._download_json(
video_id, data=urlencode_postdata(data), headers=self.geo_verification_headers()) 'https://api.entitlement.watchabc.go.com/vp2/ws-secure/entitlement/2020/authorize.json',
errors = entitlement.get('errors', {}).get('errors', []) video_id, data=urlencode_postdata(data), headers=self.geo_verification_headers())
if errors: errors = entitlement.get('errors', {}).get('errors', [])
for error in errors: if errors:
if error.get('code') == 1002: for error in errors:
self.raise_geo_restricted( if error.get('code') == 1002:
error['message'], countries=self._GEO_COUNTRIES) self.raise_geo_restricted(
error_message = ', '.join([error['message'] for error in errors]) error['message'], countries=['US'])
raise ExtractorError('%s said: %s' % (self.IE_NAME, error_message), expected=True) error_message = ', '.join([error['message'] for error in errors])
asset_url += '?' + entitlement['uplynkData']['sessionKey'] raise ExtractorError('%s said: %s' % (self.IE_NAME, error_message), expected=True)
asset_url += '?' + entitlement['uplynkData']['sessionKey']
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
asset_url, video_id, 'mp4', m3u8_id=format_id or 'hls', fatal=False)) asset_url, video_id, 'mp4', m3u8_id=format_id or 'hls', fatal=False))
else: else:
formats.append({ f = {
'format_id': format_id, 'format_id': format_id,
'url': asset_url, 'url': asset_url,
'ext': ext, 'ext': ext,
}) }
if re.search(r'(?:/mp4/source/|_source\.mp4)', asset_url):
f.update({
'format_id': ('%s-' % format_id if format_id else '') + 'SOURCE',
'preference': 1,
})
else:
mobj = re.search(r'/(\d+)x(\d+)/', asset_url)
if mobj:
height = int(mobj.group(2))
f.update({
'format_id': ('%s-' % format_id if format_id else '') + '%dP' % height,
'width': int(mobj.group(1)),
'height': height,
})
formats.append(f)
self._sort_formats(formats) self._sort_formats(formats)
subtitles = {} subtitles = {}
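The format labelling added for non-HLS assets in this hunk can be sketched as a pure function. `describe_asset` and the example URLs are assumptions made for this illustration; the two regular expressions are the ones from the diff.

```python
import re


def describe_asset(asset_url, format_id=None, ext='mp4'):
    """Build a format dict, tagging source files and WxH renditions."""
    f = {
        'format_id': format_id,
        'url': asset_url,
        'ext': ext,
    }
    prefix = '%s-' % format_id if format_id else ''
    if re.search(r'(?:/mp4/source/|_source\.mp4)', asset_url):
        # Source files are preferred over the scaled renditions.
        f.update({
            'format_id': prefix + 'SOURCE',
            'preference': 1,
        })
    else:
        mobj = re.search(r'/(\d+)x(\d+)/', asset_url)
        if mobj:
            height = int(mobj.group(2))
            f.update({
                'format_id': prefix + '%dP' % height,
                'width': int(mobj.group(1)),
                'height': height,
            })
    return f
```

For example, a URL containing `/1280x720/` with `format_id='http'` would be labelled `http-720P`, which gives the format sorter real dimensions to work with.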

View File

@ -3,6 +3,7 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str
from ..utils import ( from ..utils import (
get_element_by_attribute, get_element_by_attribute,
int_or_none, int_or_none,
@ -50,6 +51,33 @@ class InstagramIE(InfoExtractor):
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
}, {
# multi video post
'url': 'https://www.instagram.com/p/BQ0eAlwhDrw/',
'playlist': [{
'info_dict': {
'id': 'BQ0dSaohpPW',
'ext': 'mp4',
'title': 'Video 1',
},
}, {
'info_dict': {
'id': 'BQ0dTpOhuHT',
'ext': 'mp4',
'title': 'Video 2',
},
}, {
'info_dict': {
'id': 'BQ0dT7RBFeF',
'ext': 'mp4',
'title': 'Video 3',
},
}],
'info_dict': {
'id': 'BQ0eAlwhDrw',
'title': 'Post by instagram',
'description': 'md5:0f9203fc6a2ce4d228da5754bcf54957',
},
}, { }, {
'url': 'https://instagram.com/p/-Cmh1cukG2/', 'url': 'https://instagram.com/p/-Cmh1cukG2/',
'only_matching': True, 'only_matching': True,
@ -113,6 +141,32 @@ class InstagramIE(InfoExtractor):
'timestamp': int_or_none(comment.get('created_at')), 'timestamp': int_or_none(comment.get('created_at')),
} for comment in media.get( } for comment in media.get(
'comments', {}).get('nodes', []) if comment.get('text')] 'comments', {}).get('nodes', []) if comment.get('text')]
if not video_url:
edges = try_get(
media, lambda x: x['edge_sidecar_to_children']['edges'],
list) or []
if edges:
entries = []
for edge_num, edge in enumerate(edges, start=1):
node = try_get(edge, lambda x: x['node'], dict)
if not node:
continue
node_video_url = try_get(node, lambda x: x['video_url'], compat_str)
if not node_video_url:
continue
entries.append({
'id': node.get('shortcode') or node['id'],
'title': 'Video %d' % edge_num,
'url': node_video_url,
'thumbnail': node.get('display_url'),
'width': int_or_none(try_get(node, lambda x: x['dimensions']['width'])),
'height': int_or_none(try_get(node, lambda x: x['dimensions']['height'])),
'view_count': int_or_none(node.get('video_view_count')),
})
return self.playlist_result(
entries, video_id,
'Post by %s' % uploader_id if uploader_id else None,
description)
if not video_url: if not video_url:
video_url = self._og_search_video_url(webpage, secure=False) video_url = self._og_search_video_url(webpage, secure=False)
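The multi-video handling added above walks the `edge_sidecar_to_children` edges and keeps only nodes that carry a video URL. A simplified sketch using plain dict access instead of `try_get`; the `media` sample is made up for illustration.

```python
def sidecar_entries(media):
    """Collect one entry per child node that has a video URL."""
    edges = (media.get('edge_sidecar_to_children') or {}).get('edges') or []
    entries = []
    for edge_num, edge in enumerate(edges, start=1):
        node = edge.get('node') if isinstance(edge, dict) else None
        if not isinstance(node, dict):
            continue
        video_url = node.get('video_url')
        if not isinstance(video_url, str) or not video_url:
            continue
        entries.append({
            'id': node.get('shortcode') or node['id'],
            'title': 'Video %d' % edge_num,
            'url': video_url,
        })
    return entries


# Hypothetical post: the second child is an image and is skipped.
media = {'edge_sidecar_to_children': {'edges': [
    {'node': {'shortcode': 'AAA', 'video_url': 'http://example.com/1.mp4'}},
    {'node': {'shortcode': 'BBB'}},
    {'node': {'id': '42', 'video_url': 'http://example.com/3.mp4'}},
]}}
entries = sidecar_entries(media)
```

Because the title uses the edge position rather than a running count of videos, skipped image nodes leave gaps in the numbering, which matches the loop in the hunk.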

View File

@ -8,12 +8,12 @@ from .common import InfoExtractor
from ..utils import ( from ..utils import (
determine_ext, determine_ext,
js_to_json, js_to_json,
sanitized_Request,
) )
class IPrimaIE(InfoExtractor): class IPrimaIE(InfoExtractor):
_VALID_URL = r'https?://play\.iprima\.cz/(?:.+/)?(?P<id>[^?#]+)' _VALID_URL = r'https?://play\.iprima\.cz/(?:.+/)?(?P<id>[^?#]+)'
_GEO_BYPASS = False
_TESTS = [{ _TESTS = [{
'url': 'http://play.iprima.cz/gondici-s-r-o-33', 'url': 'http://play.iprima.cz/gondici-s-r-o-33',
@ -29,6 +29,10 @@ class IPrimaIE(InfoExtractor):
}, { }, {
'url': 'http://play.iprima.cz/particka/particka-92', 'url': 'http://play.iprima.cz/particka/particka-92',
'only_matching': True, 'only_matching': True,
}, {
# geo restricted
'url': 'http://play.iprima.cz/closer-nove-pripady/closer-nove-pripady-iv-1',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -38,11 +42,13 @@ class IPrimaIE(InfoExtractor):
video_id = self._search_regex(r'data-product="([^"]+)">', webpage, 'real id') video_id = self._search_regex(r'data-product="([^"]+)">', webpage, 'real id')
req = sanitized_Request( playerpage = self._download_webpage(
'http://play.iprima.cz/prehravac/init?_infuse=1' 'http://play.iprima.cz/prehravac/init',
'&_ts=%s&productId=%s' % (round(time.time()), video_id)) video_id, note='Downloading player', query={
req.add_header('Referer', url) '_infuse': 1,
playerpage = self._download_webpage(req, video_id, note='Downloading player') '_ts': round(time.time()),
'productId': video_id,
}, headers={'Referer': url})
formats = [] formats = []
@ -82,7 +88,7 @@ class IPrimaIE(InfoExtractor):
extract_formats(src) extract_formats(src)
if not formats and '>GEO_IP_NOT_ALLOWED<' in playerpage: if not formats and '>GEO_IP_NOT_ALLOWED<' in playerpage:
self.raise_geo_restricted() self.raise_geo_restricted(countries=['CZ'])
self._sort_formats(formats) self._sort_formats(formats)
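The hunk above replaces a hand-built query string with a `query` mapping. What such a mapping expands to can be sketched with the standard library; `build_player_url` is a hypothetical helper and the timestamp is passed in so the result is reproducible.

```python
from urllib.parse import urlencode


def build_player_url(product_id, timestamp):
    """Assemble the player init URL from a query mapping."""
    query = {
        '_infuse': 1,
        '_ts': round(timestamp),
        'productId': product_id,
    }
    return 'http://play.iprima.cz/prehravac/init?' + urlencode(query)


player_url = build_player_url('p123', 1488000000.4)
```

Letting a helper encode the mapping also takes care of escaping, so a product ID with reserved characters would not break the URL.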

View File

@ -16,6 +16,8 @@ class IviIE(InfoExtractor):
IE_DESC = 'ivi.ru' IE_DESC = 'ivi.ru'
IE_NAME = 'ivi' IE_NAME = 'ivi'
_VALID_URL = r'https?://(?:www\.)?ivi\.ru/(?:watch/(?:[^/]+/)?|video/player\?.*?videoId=)(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?ivi\.ru/(?:watch/(?:[^/]+/)?|video/player\?.*?videoId=)(?P<id>\d+)'
_GEO_BYPASS = False
_GEO_COUNTRIES = ['RU']
_TESTS = [ _TESTS = [
# Single movie # Single movie
@ -91,7 +93,11 @@ class IviIE(InfoExtractor):
if 'error' in video_json: if 'error' in video_json:
error = video_json['error'] error = video_json['error']
if error['origin'] == 'NoRedisValidData': origin = error['origin']
if origin == 'NotAllowedForLocation':
self.raise_geo_restricted(
msg=error['message'], countries=self._GEO_COUNTRIES)
elif origin == 'NoRedisValidData':
raise ExtractorError('Video %s does not exist' % video_id, expected=True) raise ExtractorError('Video %s does not exist' % video_id, expected=True)
raise ExtractorError( raise ExtractorError(
'Unable to download video %s: %s' % (video_id, error['message']), 'Unable to download video %s: %s' % (video_id, error['message']),

View File

@ -30,7 +30,7 @@ from ..utils import (
class LeIE(InfoExtractor): class LeIE(InfoExtractor):
IE_DESC = '乐视网' IE_DESC = '乐视网'
_VALID_URL = r'https?://(?:www\.le\.com/ptv/vplay|(?:sports\.le|(?:www\.)?lesports)\.com/(?:match|video))/(?P<id>\d+)\.html' _VALID_URL = r'https?://(?:www\.le\.com/ptv/vplay|(?:sports\.le|(?:www\.)?lesports)\.com/(?:match|video))/(?P<id>\d+)\.html'
_GEO_COUNTRIES = ['CN']
_URL_TEMPLATE = 'http://www.le.com/ptv/vplay/%s.html' _URL_TEMPLATE = 'http://www.le.com/ptv/vplay/%s.html'
_TESTS = [{ _TESTS = [{
@ -126,10 +126,9 @@ class LeIE(InfoExtractor):
if playstatus['status'] == 0: if playstatus['status'] == 0:
flag = playstatus['flag'] flag = playstatus['flag']
if flag == 1: if flag == 1:
msg = 'Country %s auth error' % playstatus['country'] self.raise_geo_restricted()
else: else:
msg = 'Generic error. flag = %d' % flag raise ExtractorError('Generic error. flag = %d' % flag, expected=True)
raise ExtractorError(msg, expected=True)
def _real_extract(self, url): def _real_extract(self, url):
media_id = self._match_id(url) media_id = self._match_id(url)

View File

@ -4,11 +4,13 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_HTTPError
from ..utils import ( from ..utils import (
determine_ext, determine_ext,
float_or_none, float_or_none,
int_or_none, int_or_none,
unsmuggle_url, unsmuggle_url,
ExtractorError,
) )
@ -20,9 +22,17 @@ class LimelightBaseIE(InfoExtractor):
headers = {} headers = {}
if referer: if referer:
headers['Referer'] = referer headers['Referer'] = referer
return self._download_json( try:
self._PLAYLIST_SERVICE_URL % (self._PLAYLIST_SERVICE_PATH, item_id, method), return self._download_json(
item_id, 'Downloading PlaylistService %s JSON' % method, fatal=fatal, headers=headers) self._PLAYLIST_SERVICE_URL % (self._PLAYLIST_SERVICE_PATH, item_id, method),
item_id, 'Downloading PlaylistService %s JSON' % method, fatal=fatal, headers=headers)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
error = self._parse_json(e.cause.read().decode(), item_id)['detail']['contentAccessPermission']
if error == 'CountryDisabled':
self.raise_geo_restricted()
raise ExtractorError(error, expected=True)
raise
def _call_api(self, organization_id, item_id, method): def _call_api(self, organization_id, item_id, method):
return self._download_json( return self._download_json(
@ -213,6 +223,7 @@ class LimelightMediaIE(LimelightBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
video_id = self._match_id(url) video_id = self._match_id(url)
self._initialize_geo_bypass(smuggled_data.get('geo_countries'))
pc, mobile, metadata = self._extract( pc, mobile, metadata = self._extract(
video_id, 'getPlaylistByMediaId', video_id, 'getPlaylistByMediaId',
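
Review note on the Limelight hunk above: the new `except` branch reads `detail.contentAccessPermission` from the body of a 403 response and treats `CountryDisabled` as a geo restriction. The following is a minimal standalone sketch of that classification step only. The function name `classify_access_error` and the sample bodies are hypothetical, and the JSON layout is inferred solely from the keys the hunk accesses.

```python
import json


def classify_access_error(body):
    """Classify a 403 JSON body as a geo restriction or another access error.

    Assumed layout (inferred from the keys read in the hunk):
    {"detail": {"contentAccessPermission": "<reason>"}}
    """
    reason = json.loads(body)['detail']['contentAccessPermission']
    # 'CountryDisabled' is the only value the hunk maps to a geo restriction;
    # any other value is reported as an expected extractor error.
    kind = 'geo' if reason == 'CountryDisabled' else 'other'
    return kind, reason


geo_body = '{"detail": {"contentAccessPermission": "CountryDisabled"}}'
other_body = '{"detail": {"contentAccessPermission": "SomeOtherReason"}}'  # hypothetical value
print(classify_access_error(geo_body))
print(classify_access_error(other_body))
```

In the extractor itself the `'geo'` case corresponds to `self.raise_geo_restricted()` and the other case to `ExtractorError(error, expected=True)`.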


@ -260,9 +260,24 @@ class LyndaCourseIE(LyndaBaseIE):
course_path = mobj.group('coursepath') course_path = mobj.group('coursepath')
course_id = mobj.group('courseid') course_id = mobj.group('courseid')
item_template = 'https://www.lynda.com/%s/%%s-4.html' % course_path
course = self._download_json( course = self._download_json(
'https://www.lynda.com/ajax/player?courseId=%s&type=course' % course_id, 'https://www.lynda.com/ajax/player?courseId=%s&type=course' % course_id,
course_id, 'Downloading course JSON') course_id, 'Downloading course JSON', fatal=False)
if not course:
webpage = self._download_webpage(url, course_id)
entries = [
self.url_result(
item_template % video_id, ie=LyndaIE.ie_key(),
video_id=video_id)
for video_id in re.findall(
r'data-video-id=["\'](\d+)', webpage)]
return self.playlist_result(
entries, course_id,
self._og_search_title(webpage, fatal=False),
self._og_search_description(webpage))
if course.get('Status') == 'NotFound': if course.get('Status') == 'NotFound':
raise ExtractorError( raise ExtractorError(
@ -283,7 +298,7 @@ class LyndaCourseIE(LyndaBaseIE):
if video_id: if video_id:
entries.append({ entries.append({
'_type': 'url_transparent', '_type': 'url_transparent',
'url': 'https://www.lynda.com/%s/%s-4.html' % (course_path, video_id), 'url': item_template % video_id,
'ie_key': LyndaIE.ie_key(), 'ie_key': LyndaIE.ie_key(),
'chapter': chapter.get('Title'), 'chapter': chapter.get('Title'),
'chapter_number': int_or_none(chapter.get('ChapterIndex')), 'chapter_number': int_or_none(chapter.get('ChapterIndex')),


@ -14,7 +14,7 @@ from ..utils import (
class MDRIE(InfoExtractor): class MDRIE(InfoExtractor):
IE_DESC = 'MDR.DE and KiKA' IE_DESC = 'MDR.DE and KiKA'
_VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z]+-?(?P<id>\d+)(?:_.+?)?\.html' _VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z-]+-?(?P<id>\d+)(?:_.+?)?\.html'
_TESTS = [{ _TESTS = [{
# MDR regularly deletes its videos # MDR regularly deletes its videos
@ -31,6 +31,7 @@ class MDRIE(InfoExtractor):
'duration': 250, 'duration': 250,
'uploader': 'MITTELDEUTSCHER RUNDFUNK', 'uploader': 'MITTELDEUTSCHER RUNDFUNK',
}, },
'skip': '404 not found',
}, { }, {
'url': 'http://www.kika.de/baumhaus/videos/video19636.html', 'url': 'http://www.kika.de/baumhaus/videos/video19636.html',
'md5': '4930515e36b06c111213e80d1e4aad0e', 'md5': '4930515e36b06c111213e80d1e4aad0e',
@ -41,6 +42,7 @@ class MDRIE(InfoExtractor):
'duration': 134, 'duration': 134,
'uploader': 'KIKA', 'uploader': 'KIKA',
}, },
'skip': '404 not found',
}, { }, {
'url': 'http://www.kika.de/sendungen/einzelsendungen/weihnachtsprogramm/videos/video8182.html', 'url': 'http://www.kika.de/sendungen/einzelsendungen/weihnachtsprogramm/videos/video8182.html',
'md5': '5fe9c4dd7d71e3b238f04b8fdd588357', 'md5': '5fe9c4dd7d71e3b238f04b8fdd588357',
@ -49,11 +51,21 @@ class MDRIE(InfoExtractor):
'ext': 'mp4', 'ext': 'mp4',
'title': 'Beutolomäus und der geheime Weihnachtswunsch', 'title': 'Beutolomäus und der geheime Weihnachtswunsch',
'description': 'md5:b69d32d7b2c55cbe86945ab309d39bbd', 'description': 'md5:b69d32d7b2c55cbe86945ab309d39bbd',
'timestamp': 1450950000, 'timestamp': 1482541200,
'upload_date': '20151224', 'upload_date': '20161224',
'duration': 4628, 'duration': 4628,
'uploader': 'KIKA', 'uploader': 'KIKA',
}, },
}, {
# audio with alternative playerURL pattern
'url': 'http://www.mdr.de/kultur/videos-und-audios/audio-radio/operation-mindfuck-robert-wilson100.html',
'info_dict': {
'id': '100',
'ext': 'mp4',
'title': 'Feature: Operation Mindfuck - Robert Anton Wilson',
'duration': 3239,
'uploader': 'MITTELDEUTSCHER RUNDFUNK',
},
}, { }, {
'url': 'http://www.kika.de/baumhaus/sendungen/video19636_zc-fea7f8a0_zs-4bf89c60.html', 'url': 'http://www.kika.de/baumhaus/sendungen/video19636_zc-fea7f8a0_zs-4bf89c60.html',
'only_matching': True, 'only_matching': True,
@ -71,7 +83,7 @@ class MDRIE(InfoExtractor):
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
data_url = self._search_regex( data_url = self._search_regex(
r'(?:dataURL|playerXml(?:["\'])?)\s*:\s*(["\'])(?P<url>.+/(?:video|audio)-?[0-9]+-avCustom\.xml)\1', r'(?:dataURL|playerXml(?:["\'])?)\s*:\s*(["\'])(?P<url>.+?-avCustom\.xml)\1',
webpage, 'data url', group='url').replace(r'\/', '/') webpage, 'data url', group='url').replace(r'\/', '/')
doc = self._download_xml( doc = self._download_xml(
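
Review note on the `_VALID_URL` change above: widening `[a-z]+` to `[a-z-]+` lets the last path segment contain hyphens before the numeric ID, which the new audio test URL needs. The sketch below compares the two patterns copied from the hunk using only the standard `re` module. The helper `match_id` is a hypothetical stand-in for the extractor's `_match_id`, not the library function.

```python
import re

# Both patterns are copied verbatim from the hunk (old and new columns).
OLD_VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z]+-?(?P<id>\d+)(?:_.+?)?\.html'
NEW_VALID_URL = r'https?://(?:www\.)?(?:mdr|kika)\.de/(?:.*)/[a-z-]+-?(?P<id>\d+)(?:_.+?)?\.html'

AUDIO_URL = ('http://www.mdr.de/kultur/videos-und-audios/audio-radio/'
             'operation-mindfuck-robert-wilson100.html')
KIKA_URL = 'http://www.kika.de/baumhaus/videos/video19636.html'


def match_id(pattern, url):
    """Hypothetical helper: return the 'id' group, or None when the URL does not match."""
    m = re.match(pattern, url)
    return m.group('id') if m else None


for name, pattern in (('old', OLD_VALID_URL), ('new', NEW_VALID_URL)):
    print(name, match_id(pattern, AUDIO_URL), match_id(pattern, KIKA_URL))
```

The old pattern rejects the hyphenated audio URL, while both patterns still accept the plain `video19636.html` form.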


@ -2,16 +2,17 @@
from __future__ import unicode_literals from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str
from ..utils import int_or_none from ..utils import int_or_none
class MGTVIE(InfoExtractor): class MGTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?mgtv\.com/v/(?:[^/]+/)*(?P<id>\d+)\.html' _VALID_URL = r'https?://(?:www\.)?mgtv\.com/(v|b)/(?:[^/]+/)*(?P<id>\d+)\.html'
IE_DESC = '芒果TV' IE_DESC = '芒果TV'
_TESTS = [{ _TESTS = [{
'url': 'http://www.mgtv.com/v/1/290525/f/3116640.html', 'url': 'http://www.mgtv.com/v/1/290525/f/3116640.html',
'md5': '1bdadcf760a0b90946ca68ee9a2db41a', 'md5': 'b1ffc0fc163152acf6beaa81832c9ee7',
'info_dict': { 'info_dict': {
'id': '3116640', 'id': '3116640',
'ext': 'mp4', 'ext': 'mp4',
@ -21,48 +22,45 @@ class MGTVIE(InfoExtractor):
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
}, },
}, { }, {
# no tbr extracted from stream_url 'url': 'http://www.mgtv.com/b/301817/3826653.html',
'url': 'http://www.mgtv.com/v/1/1/f/3324755.html',
'only_matching': True, 'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
api_data = self._download_json( api_data = self._download_json(
'http://v.api.mgtv.com/player/video', video_id, 'http://pcweb.api.mgtv.com/player/video', video_id,
query={'video_id': video_id}, query={'video_id': video_id},
headers=self.geo_verification_headers())['data'] headers=self.geo_verification_headers())['data']
info = api_data['info'] info = api_data['info']
title = info['title'].strip()
stream_domain = api_data['stream_domain'][0]
formats = [] formats = []
for idx, stream in enumerate(api_data['stream']): for idx, stream in enumerate(api_data['stream']):
stream_url = stream.get('url') stream_path = stream.get('url')
if not stream_url: if not stream_path:
continue
format_data = self._download_json(
stream_domain + stream_path, video_id,
note='Download video info for format #%d' % idx)
format_url = format_data.get('info')
if not format_url:
continue continue
tbr = int_or_none(self._search_regex( tbr = int_or_none(self._search_regex(
r'(\d+)\.mp4', stream_url, 'tbr', default=None)) r'_(\d+)_mp4/', format_url, 'tbr', default=None))
formats.append({
def extract_format(stream_url, format_id, idx, query={}): 'format_id': compat_str(tbr or idx),
format_info = self._download_json( 'url': format_url,
stream_url, video_id, 'ext': 'mp4',
note='Download video info for format %s' % (format_id or '#%d' % idx), 'tbr': tbr,
query=query) 'protocol': 'm3u8_native',
return { })
'format_id': format_id,
'url': format_info['info'],
'ext': 'mp4',
'tbr': tbr,
}
formats.append(extract_format(
stream_url, 'hls-%d' % tbr if tbr else None, idx * 2))
formats.append(extract_format(stream_url.replace(
'/playlist.m3u8', ''), 'http-%d' % tbr if tbr else None, idx * 2 + 1, {'pno': 1031}))
self._sort_formats(formats) self._sort_formats(formats)
return { return {
'id': video_id, 'id': video_id,
'title': info['title'].strip(), 'title': title,
'formats': formats, 'formats': formats,
'description': info.get('desc'), 'description': info.get('desc'),
'duration': int_or_none(info.get('duration')), 'duration': int_or_none(info.get('duration')),


@ -19,6 +19,7 @@ class NineCNineMediaBaseIE(InfoExtractor):
class NineCNineMediaStackIE(NineCNineMediaBaseIE): class NineCNineMediaStackIE(NineCNineMediaBaseIE):
IE_NAME = '9c9media:stack' IE_NAME = '9c9media:stack'
_GEO_COUNTRIES = ['CA']
_VALID_URL = r'9c9media:stack:(?P<destination_code>[^:]+):(?P<content_id>\d+):(?P<content_package>\d+):(?P<id>\d+)' _VALID_URL = r'9c9media:stack:(?P<destination_code>[^:]+):(?P<content_id>\d+):(?P<content_package>\d+):(?P<id>\d+)'
def _real_extract(self, url): def _real_extract(self, url):


@ -0,0 +1,83 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
get_element_by_class,
urlencode_postdata,
)
class NJPWWorldIE(InfoExtractor):
_VALID_URL = r'https?://njpwworld\.com/p/(?P<id>[a-z0-9_]+)'
IE_DESC = '新日本プロレスワールド'
_NETRC_MACHINE = 'njpwworld'
_TEST = {
'url': 'http://njpwworld.com/p/s_series_00155_1_9/',
'info_dict': {
'id': 's_series_00155_1_9',
'ext': 'mp4',
'title': '第9試合 ランディ・サベージ vs リック・スタイナー',
'tags': list,
},
'params': {
'skip_download': True, # AES-encrypted m3u8
},
'skip': 'Requires login',
}
def _real_initialize(self):
self._login()
def _login(self):
username, password = self._get_login_info()
# No authentication to be performed
if not username:
return True
webpage, urlh = self._download_webpage_handle(
'https://njpwworld.com/auth/login', None,
note='Logging in', errnote='Unable to login',
data=urlencode_postdata({'login_id': username, 'pw': password}))
# /auth/login will return 302 for successful logins
if urlh.geturl() == 'https://njpwworld.com/auth/login':
self.report_warning('unable to login')
return False
return True
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
formats = []
for player_url, kind in re.findall(r'<a[^>]+href="(/player[^"]+)".+?<img[^>]+src="[^"]+qf_btn_([^".]+)', webpage):
player_url = compat_urlparse.urljoin(url, player_url)
player_page = self._download_webpage(
player_url, video_id, note='Downloading player page')
entries = self._parse_html5_media_entries(
player_url, player_page, video_id, m3u8_id='hls-%s' % kind,
m3u8_entry_protocol='m3u8_native',
preference=2 if 'hq' in kind else 1)
formats.extend(entries[0]['formats'])
self._sort_formats(formats)
post_content = get_element_by_class('post-content', webpage)
tags = re.findall(
r'<li[^>]+class="tag-[^"]+"><a[^>]*>([^<]+)</a></li>', post_content
) if post_content else None
return {
'id': video_id,
'title': self._og_search_title(webpage),
'formats': formats,
'tags': tags,
}


@ -23,7 +23,7 @@ from ..utils import (
class NocoIE(InfoExtractor): class NocoIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www\.)?noco\.tv/emission/|player\.noco\.tv/\?idvideo=)(?P<id>\d+)' _VALID_URL = r'https?://(?:(?:www\.)?noco\.tv/emission/|player\.noco\.tv/\?idvideo=)(?P<id>\d+)'
_LOGIN_URL = 'http://noco.tv/do.php' _LOGIN_URL = 'https://noco.tv/do.php'
_API_URL_TEMPLATE = 'https://api.noco.tv/1.1/%s?ts=%s&tk=%s' _API_URL_TEMPLATE = 'https://api.noco.tv/1.1/%s?ts=%s&tk=%s'
_SUB_LANG_TEMPLATE = '&sub_lang=%s' _SUB_LANG_TEMPLATE = '&sub_lang=%s'
_NETRC_MACHINE = 'noco' _NETRC_MACHINE = 'noco'
@ -69,16 +69,17 @@ class NocoIE(InfoExtractor):
if username is None: if username is None:
return return
login_form = { login = self._download_json(
'a': 'login', self._LOGIN_URL, None, 'Logging in as %s' % username,
'cookie': '1', data=urlencode_postdata({
'username': username, 'a': 'login',
'password': password, 'cookie': '1',
} 'username': username,
request = sanitized_Request(self._LOGIN_URL, urlencode_postdata(login_form)) 'password': password,
request.add_header('Content-Type', 'application/x-www-form-urlencoded; charset=UTF-8') }),
headers={
login = self._download_json(request, None, 'Logging in as %s' % username) 'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
})
if 'erreur' in login: if 'erreur' in login:
raise ExtractorError('Unable to login: %s' % clean_html(login['erreur']), expected=True) raise ExtractorError('Unable to login: %s' % clean_html(login['erreur']), expected=True)


@ -72,16 +72,21 @@ class OpenloadIE(InfoExtractor):
raise ExtractorError('File not found', expected=True) raise ExtractorError('File not found', expected=True)
ol_id = self._search_regex( ol_id = self._search_regex(
'<span[^>]+id="[^"]+"[^>]*>([0-9]+)</span>', '<span[^>]+id="[^"]+"[^>]*>([0-9A-Za-z]+)</span>',
webpage, 'openload ID') webpage, 'openload ID')
first_two_chars = int(float(ol_id[0:][:2])) first_char = int(ol_id[0])
urlcode = [] urlcode = []
num = 2 num = 1
while num < len(ol_id): while num < len(ol_id):
key = int(float(ol_id[num + 3:][:2])) i = ord(ol_id[num])
urlcode.append((key, compat_chr(int(float(ol_id[num:][:3])) - first_two_chars))) key = 0
if i <= 90:
key = i - 65
elif i >= 97:
key = 25 + i - 97
urlcode.append((key, compat_chr(int(ol_id[num + 2:num + 5]) // int(ol_id[num + 1]) - first_char)))
num += 5 num += 5
video_url = 'https://openload.co/stream/' + ''.join( video_url = 'https://openload.co/stream/' + ''.join(
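
Review note on the Openload hunk above: the new loop reads the ID in five-character groups after a leading digit. Each group holds a letter that gives the output position, one divisor digit, and a three-digit number. The sketch below restates that loop as a standalone function so it can be tried on a hand-built input. Two points are assumptions, not taken from the visible diff: `chr` stands in for `compat_chr`, and the final join in ascending key order is a guess, since the hunk is cut off before that step. The function name and the sample ID are hypothetical.

```python
def decode_openload_id(ol_id):
    """Decode an ID laid out as <digit> followed by groups of <letter><digit><3 digits>."""
    first_char = int(ol_id[0])
    pairs = []
    num = 1
    while num < len(ol_id):
        i = ord(ol_id[num])
        key = 0
        if i <= 90:        # 'A'..'Z' map to 0..25
            key = i - 65
        elif i >= 97:      # 'a'..'z' map to 25..50, as in the hunk
            key = 25 + i - 97
        value = int(ol_id[num + 2:num + 5]) // int(ol_id[num + 1]) - first_char
        pairs.append((key, chr(value)))
        num += 5
    # Assumption: characters are joined in ascending key order.
    return ''.join(ch for _, ch in sorted(pairs, key=lambda p: p[0]))


# Hand-built sample: offset 3, 'a' stored as (97 + 3) * 2 = 200 with divisor 2,
# 'b' stored as (98 + 3) * 1 = 101 with divisor 1.
print(decode_openload_id('3A2200B1101'))
```

Swapping the two groups in the sample (`'3B1101A2200'`) gives the same result under the sorting assumption, because the letters carry the position.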


@ -0,0 +1,42 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
class SkylineWebcamsIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?skylinewebcams\.com/[^/]+/webcam/(?:[^/]+/)+(?P<id>[^/]+)\.html'
_TEST = {
'url': 'https://www.skylinewebcams.com/it/webcam/italia/lazio/roma/scalinata-piazza-di-spagna-barcaccia.html',
'info_dict': {
'id': 'scalinata-piazza-di-spagna-barcaccia',
'ext': 'mp4',
'title': 're:^Live Webcam Scalinata di Piazza di Spagna - La Barcaccia [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'Roma, veduta sulla Scalinata di Piazza di Spagna e sulla Barcaccia',
'is_live': True,
},
'params': {
'skip_download': True,
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
stream_url = self._search_regex(
r'url\s*:\s*(["\'])(?P<url>(?:https?:)?//.+?\.m3u8.*?)\1', webpage,
'stream url', group='url')
title = self._og_search_title(webpage)
description = self._og_search_description(webpage)
return {
'id': video_id,
'url': stream_url,
'ext': 'mp4',
'title': self._live_title(title),
'description': description,
'is_live': True,
}


@ -108,12 +108,11 @@ class SohuIE(InfoExtractor):
if vid_data['play'] != 1: if vid_data['play'] != 1:
if vid_data.get('status') == 12: if vid_data.get('status') == 12:
raise ExtractorError( raise ExtractorError(
'Sohu said: There\'s something wrong in the video.', '%s said: There\'s something wrong in the video.' % self.IE_NAME,
expected=True) expected=True)
else: else:
raise ExtractorError( self.raise_geo_restricted(
'Sohu said: The video is only licensed to users in Mainland China.', '%s said: The video is only licensed to users in Mainland China.' % self.IE_NAME)
expected=True)
formats_json = {} formats_json = {}
for format_id in ('nor', 'high', 'super', 'ori', 'h2644k', 'h2654k'): for format_id in ('nor', 'high', 'super', 'ori', 'h2644k', 'h2654k'):


@ -23,6 +23,10 @@ class SpankBangIE(InfoExtractor):
# 480p only # 480p only
'url': 'http://spankbang.com/1vt0/video/solvane+gangbang', 'url': 'http://spankbang.com/1vt0/video/solvane+gangbang',
'only_matching': True, 'only_matching': True,
}, {
# no uploader
'url': 'http://spankbang.com/lklg/video/sex+with+anyone+wedding+edition+2',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -48,7 +52,7 @@ class SpankBangIE(InfoExtractor):
thumbnail = self._og_search_thumbnail(webpage) thumbnail = self._og_search_thumbnail(webpage)
uploader = self._search_regex( uploader = self._search_regex(
r'class="user"[^>]*><img[^>]+>([^<]+)', r'class="user"[^>]*><img[^>]+>([^<]+)',
webpage, 'uploader', fatal=False) webpage, 'uploader', default=None)
age_limit = self._rta_search(webpage) age_limit = self._rta_search(webpage)


@ -2,7 +2,10 @@
from __future__ import unicode_literals from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import int_or_none from ..utils import (
int_or_none,
smuggle_url,
)
class TeleQuebecIE(InfoExtractor): class TeleQuebecIE(InfoExtractor):
@ -28,7 +31,7 @@ class TeleQuebecIE(InfoExtractor):
return { return {
'_type': 'url_transparent', '_type': 'url_transparent',
'id': media_id, 'id': media_id,
'url': 'limelight:media:' + media_data['streamInfo']['sourceId'], 'url': smuggle_url('limelight:media:' + media_data['streamInfo']['sourceId'], {'geo_countries': ['CA']}),
'title': media_data['title'], 'title': media_data['title'],
'description': media_data.get('descriptions', [{'text': None}])[0].get('text'), 'description': media_data.get('descriptions', [{'text': None}])[0].get('text'),
'duration': int_or_none(media_data.get('durationInMilliseconds'), 1000), 'duration': int_or_none(media_data.get('durationInMilliseconds'), 1000),


@ -8,10 +8,12 @@ from ..utils import (
HEADRequest, HEADRequest,
ExtractorError, ExtractorError,
int_or_none, int_or_none,
clean_html,
) )
class TFOIE(InfoExtractor): class TFOIE(InfoExtractor):
_GEO_COUNTRIES = ['CA']
_VALID_URL = r'https?://(?:www\.)?tfo\.org/(?:en|fr)/(?:[^/]+/){2}(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?tfo\.org/(?:en|fr)/(?:[^/]+/){2}(?P<id>\d+)'
_TEST = { _TEST = {
'url': 'http://www.tfo.org/en/universe/tfo-247/100463871/video-game-hackathon', 'url': 'http://www.tfo.org/en/universe/tfo-247/100463871/video-game-hackathon',
@ -36,7 +38,9 @@ class TFOIE(InfoExtractor):
'X-tfo-session': self._get_cookies('http://www.tfo.org/')['tfo-session'].value, 'X-tfo-session': self._get_cookies('http://www.tfo.org/')['tfo-session'].value,
}) })
if infos.get('success') == 0: if infos.get('success') == 0:
raise ExtractorError('%s said: %s' % (self.IE_NAME, infos['msg']), expected=True) if infos.get('code') == 'ErrGeoBlocked':
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
raise ExtractorError('%s said: %s' % (self.IE_NAME, clean_html(infos['msg'])), expected=True)
video_data = infos['data'] video_data = infos['data']
return { return {


@ -3,7 +3,10 @@ from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_urlparse from ..compat import compat_urlparse
from ..utils import qualities from ..utils import (
int_or_none,
qualities,
)
class TheSceneIE(InfoExtractor): class TheSceneIE(InfoExtractor):
@ -16,6 +19,11 @@ class TheSceneIE(InfoExtractor):
'ext': 'mp4', 'ext': 'mp4',
'title': 'Narciso Rodriguez: Spring 2013 Ready-to-Wear', 'title': 'Narciso Rodriguez: Spring 2013 Ready-to-Wear',
'display_id': 'narciso-rodriguez-spring-2013-ready-to-wear', 'display_id': 'narciso-rodriguez-spring-2013-ready-to-wear',
'duration': 127,
'series': 'Style.com Fashion Shows',
'season': 'Ready To Wear Spring 2013',
'tags': list,
'categories': list,
}, },
} }
@ -32,21 +40,29 @@ class TheSceneIE(InfoExtractor):
player = self._download_webpage(player_url, display_id) player = self._download_webpage(player_url, display_id)
info = self._parse_json( info = self._parse_json(
self._search_regex( self._search_regex(
r'(?m)var\s+video\s+=\s+({.+?});$', player, 'info json'), r'(?m)video\s*:\s*({.+?}),$', player, 'info json'),
display_id) display_id)
video_id = info['id']
title = info['title']
qualities_order = qualities(('low', 'high')) qualities_order = qualities(('low', 'high'))
formats = [{ formats = [{
'format_id': '{0}-{1}'.format(f['type'].split('/')[0], f['quality']), 'format_id': '{0}-{1}'.format(f['type'].split('/')[0], f['quality']),
'url': f['src'], 'url': f['src'],
'quality': qualities_order(f['quality']), 'quality': qualities_order(f['quality']),
} for f in info['sources'][0]] } for f in info['sources']]
self._sort_formats(formats) self._sort_formats(formats)
return { return {
'id': info['id'], 'id': video_id,
'display_id': display_id, 'display_id': display_id,
'title': info['title'], 'title': title,
'formats': formats, 'formats': formats,
'thumbnail': info.get('poster_frame'), 'thumbnail': info.get('poster_frame'),
'duration': int_or_none(info.get('duration')),
'series': info.get('series_title'),
'season': info.get('season_title'),
'tags': info.get('tags'),
'categories': info.get('categories'),
} }
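
Review note on the TheScene hunk above: the info JSON is now located with `(?m)video\s*:\s*({.+?}),$`, and `info['sources']` is iterated directly instead of `info['sources'][0]`. The sketch below runs the new regex over an invented player snippet. The `player_js` text, its field values and the example URL are hypothetical; only the regex and the `format_id` expression come from the hunk.

```python
import json
import re

# Hypothetical excerpt of a player page; the real page layout is not shown in the diff.
player_js = '''
var config = {
  autoplay: false,
  video: {"id": "abc123", "title": "Demo", "sources": [{"src": "http://example.com/v.mp4", "type": "video/mp4", "quality": "high"}]},
  volume: 1
};
'''

# Regex copied from the new column of the hunk: the object must end with "}," at a line end.
m = re.search(r'(?m)video\s*:\s*({.+?}),$', player_js)
info = json.loads(m.group(1))

# Same format_id expression as in the hunk, applied to every entry of info['sources'].
format_ids = [
    '{0}-{1}'.format(f['type'].split('/')[0], f['quality'])
    for f in info['sources']]
print(info['id'], format_ids)
```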


@ -16,6 +16,7 @@ class TubiTvIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?tubitv\.com/video/(?P<id>[0-9]+)' _VALID_URL = r'https?://(?:www\.)?tubitv\.com/video/(?P<id>[0-9]+)'
_LOGIN_URL = 'http://tubitv.com/login' _LOGIN_URL = 'http://tubitv.com/login'
_NETRC_MACHINE = 'tubitv' _NETRC_MACHINE = 'tubitv'
_GEO_COUNTRIES = ['US']
_TEST = { _TEST = {
'url': 'http://tubitv.com/video/283829/the_comedian_at_the_friday', 'url': 'http://tubitv.com/video/283829/the_comedian_at_the_friday',
'md5': '43ac06be9326f41912dc64ccf7a80320', 'md5': '43ac06be9326f41912dc64ccf7a80320',


@ -17,6 +17,9 @@ class TvigleIE(InfoExtractor):
IE_DESC = 'Интернет-телевидение Tvigle.ru' IE_DESC = 'Интернет-телевидение Tvigle.ru'
_VALID_URL = r'https?://(?:www\.)?(?:tvigle\.ru/(?:[^/]+/)+(?P<display_id>[^/]+)/$|cloud\.tvigle\.ru/video/(?P<id>\d+))' _VALID_URL = r'https?://(?:www\.)?(?:tvigle\.ru/(?:[^/]+/)+(?P<display_id>[^/]+)/$|cloud\.tvigle\.ru/video/(?P<id>\d+))'
_GEO_BYPASS = False
_GEO_COUNTRIES = ['RU']
_TESTS = [ _TESTS = [
{ {
'url': 'http://www.tvigle.ru/video/sokrat/', 'url': 'http://www.tvigle.ru/video/sokrat/',
@ -72,8 +75,13 @@ class TvigleIE(InfoExtractor):
error_message = item.get('errorMessage') error_message = item.get('errorMessage')
if not videos and error_message: if not videos and error_message:
raise ExtractorError( if item.get('isGeoBlocked') is True:
'%s returned error: %s' % (self.IE_NAME, error_message), expected=True) self.raise_geo_restricted(
msg=error_message, countries=self._GEO_COUNTRIES)
else:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error_message),
expected=True)
title = item['title'] title = item['title']
description = item.get('description') description = item.get('description')


@ -12,7 +12,7 @@ from ..utils import (
class TwentyFourVideoIE(InfoExtractor): class TwentyFourVideoIE(InfoExtractor):
IE_NAME = '24video' IE_NAME = '24video'
_VALID_URL = r'https?://(?:www\.)?24video\.(?:net|me|xxx|sex)/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?24video\.(?:net|me|xxx|sex|tube)/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'http://www.24video.net/video/view/1044982', 'url': 'http://www.24video.net/video/view/1044982',
@ -37,6 +37,9 @@ class TwentyFourVideoIE(InfoExtractor):
}, { }, {
'url': 'http://www.24video.me/video/view/1044982', 'url': 'http://www.24video.me/video/view/1044982',
'only_matching': True, 'only_matching': True,
}, {
'url': 'http://www.24video.tube/video/view/2363750',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):


@ -17,12 +17,12 @@ from ..utils import (
class VevoBaseIE(InfoExtractor): class VevoBaseIE(InfoExtractor):
def _extract_json(self, webpage, video_id, item): def _extract_json(self, webpage, video_id):
return self._parse_json( return self._parse_json(
self._search_regex( self._search_regex(
r'window\.__INITIAL_STORE__\s*=\s*({.+?});\s*</script>', r'window\.__INITIAL_STORE__\s*=\s*({.+?});\s*</script>',
webpage, 'initial store'), webpage, 'initial store'),
video_id)['default'][item] video_id)
class VevoIE(VevoBaseIE): class VevoIE(VevoBaseIE):
@ -139,6 +139,11 @@ class VevoIE(VevoBaseIE):
# no genres available # no genres available
'url': 'http://www.vevo.com/watch/INS171400764', 'url': 'http://www.vevo.com/watch/INS171400764',
'only_matching': True, 'only_matching': True,
}, {
# Another case available only via the webpage; using streams/streamsV3 formats
# Geo-restricted to Netherlands/Germany
'url': 'http://www.vevo.com/watch/boostee/pop-corn-clip-officiel/FR1A91600909',
'only_matching': True,
}] }]
_VERSIONS = { _VERSIONS = {
0: 'youtube', # only in AuthenticateVideo videoVersions 0: 'youtube', # only in AuthenticateVideo videoVersions
@ -193,7 +198,14 @@ class VevoIE(VevoBaseIE):
# https://github.com/rg3/youtube-dl/issues/9366) # https://github.com/rg3/youtube-dl/issues/9366)
if not video_versions: if not video_versions:
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
video_versions = self._extract_json(webpage, video_id, 'streams')[video_id][0] json_data = self._extract_json(webpage, video_id)
if 'streams' in json_data.get('default', {}):
video_versions = json_data['default']['streams'][video_id][0]
else:
video_versions = [
value
for key, value in json_data['apollo']['data'].items()
if key.startswith('%s.streams' % video_id)]
uploader = None uploader = None
artist = None artist = None
@ -207,7 +219,7 @@ class VevoIE(VevoBaseIE):
formats = [] formats = []
for video_version in video_versions: for video_version in video_versions:
version = self._VERSIONS.get(video_version['version']) version = self._VERSIONS.get(video_version.get('version'), 'generic')
version_url = video_version.get('url') version_url = video_version.get('url')
if not version_url: if not version_url:
continue continue
@ -339,7 +351,7 @@ class VevoPlaylistIE(VevoBaseIE):
if video_id: if video_id:
return self.url_result('vevo:%s' % video_id, VevoIE.ie_key()) return self.url_result('vevo:%s' % video_id, VevoIE.ie_key())
playlists = self._extract_json(webpage, playlist_id, '%ss' % playlist_kind) playlists = self._extract_json(webpage, playlist_id)['default']['%ss' % playlist_kind]
playlist = (list(playlists.values())[0] playlist = (list(playlists.values())[0]
if playlist_kind == 'playlist' else playlists[playlist_id]) if playlist_kind == 'playlist' else playlists[playlist_id])
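
Review note on the Vevo hunks above: `_extract_json` now returns the whole initial store, and `VevoIE` picks the stream list from `default.streams` when present, otherwise from `apollo.data` entries whose key starts with `<video_id>.streams`. The sketch below isolates that selection. The function name `pick_video_versions` and both sample stores are hypothetical; the real store contents are not visible in the diff.

```python
def pick_video_versions(json_data, video_id):
    """Select stream descriptions from either store layout handled in the hunk."""
    if 'streams' in json_data.get('default', {}):
        # Older layout: default -> streams -> <video_id> -> [list of versions]
        return json_data['default']['streams'][video_id][0]
    # Newer layout: flat apollo data keyed by '<video_id>.streams...'
    return [
        value
        for key, value in json_data['apollo']['data'].items()
        if key.startswith('%s.streams' % video_id)]


# Hypothetical sample stores, shaped only after the keys the hunk reads.
old_store = {'default': {'streams': {'ID1': [[{'version': 2, 'url': 'http://example.com/a'}]]}}}
new_store = {'apollo': {'data': {
    'ID1': {'title': 'ignored'},
    'ID1.streams.0': {'url': 'http://example.com/b'},
    'ID1.streams.1': {'url': 'http://example.com/c'},
}}}

print(pick_video_versions(old_store, 'ID1'))
print(pick_video_versions(new_store, 'ID1'))
```

Entries from the newer layout may lack a `version` field, which appears to be why the hunk also switches to `self._VERSIONS.get(video_version.get('version'), 'generic')`.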


@ -13,7 +13,7 @@ from ..utils import (
class VidziIE(InfoExtractor): class VidziIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?vidzi\.tv/(?:embed-)?(?P<id>[0-9a-zA-Z]+)' _VALID_URL = r'https?://(?:www\.)?vidzi\.(?:tv|cc)/(?:embed-)?(?P<id>[0-9a-zA-Z]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://vidzi.tv/cghql9yq6emu.html', 'url': 'http://vidzi.tv/cghql9yq6emu.html',
'md5': '4f16c71ca0c8c8635ab6932b5f3f1660', 'md5': '4f16c71ca0c8c8635ab6932b5f3f1660',
@ -29,6 +29,9 @@ class VidziIE(InfoExtractor):
}, { }, {
'url': 'http://vidzi.tv/embed-4z2yb0rzphe9-600x338.html', 'url': 'http://vidzi.tv/embed-4z2yb0rzphe9-600x338.html',
'skip_download': True, 'skip_download': True,
}, {
'url': 'http://vidzi.cc/cghql9yq6emu.html',
'skip_download': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):


@ -86,7 +86,9 @@ class ViewsterIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
# Get 'api_token' cookie # Get 'api_token' cookie
self._request_webpage(HEADRequest('http://www.viewster.com/'), video_id) self._request_webpage(
HEADRequest('http://www.viewster.com/'),
video_id, headers=self.geo_verification_headers())
cookies = self._get_cookies('http://www.viewster.com/') cookies = self._get_cookies('http://www.viewster.com/')
self._AUTH_TOKEN = compat_urllib_parse_unquote(cookies['api_token'].value) self._AUTH_TOKEN = compat_urllib_parse_unquote(cookies['api_token'].value)


@ -5,6 +5,7 @@ import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
dict_get, dict_get,
ExtractorError,
int_or_none, int_or_none,
parse_duration, parse_duration,
unified_strdate, unified_strdate,
@ -57,6 +58,10 @@ class XHamsterIE(InfoExtractor):
}, { }, {
'url': 'https://xhamster.com/movies/2272726/amber_slayed_by_the_knight.html', 'url': 'https://xhamster.com/movies/2272726/amber_slayed_by_the_knight.html',
'only_matching': True, 'only_matching': True,
}, {
# This video is visible for marcoalfa123456's friends only
'url': 'https://it.xhamster.com/movies/7263980/la_mia_vicina.html',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@ -78,6 +83,12 @@ class XHamsterIE(InfoExtractor):
mrss_url = '%s://xhamster.com/movies/%s/%s.html' % (proto, video_id, seo) mrss_url = '%s://xhamster.com/movies/%s/%s.html' % (proto, video_id, seo)
webpage = self._download_webpage(mrss_url, video_id) webpage = self._download_webpage(mrss_url, video_id)
error = self._html_search_regex(
r'<div[^>]+id=["\']videoClosed["\'][^>]*>(.+?)</div>',
webpage, 'error', default=None)
if error:
raise ExtractorError(error, expected=True)
title = self._html_search_regex( title = self._html_search_regex(
[r'<h1[^>]*>([^<]+)</h1>', [r'<h1[^>]*>([^<]+)</h1>',
r'<meta[^>]+itemprop=".*?caption.*?"[^>]+content="(.+?)"', r'<meta[^>]+itemprop=".*?caption.*?"[^>]+content="(.+?)"',


@ -47,7 +47,6 @@ from ..utils import (
unsmuggle_url, unsmuggle_url,
uppercase_escape, uppercase_escape,
urlencode_postdata, urlencode_postdata,
ISO3166Utils,
) )
@ -371,6 +370,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
} }
_SUBTITLE_FORMATS = ('ttml', 'vtt') _SUBTITLE_FORMATS = ('ttml', 'vtt')
_GEO_BYPASS = False
IE_NAME = 'youtube' IE_NAME = 'youtube'
_TESTS = [ _TESTS = [
{ {
@ -917,7 +918,12 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# itag 212 # itag 212
'url': '1t24XAntNCY', 'url': '1t24XAntNCY',
'only_matching': True, 'only_matching': True,
} },
{
# geo restricted to JP
'url': 'sJL6WA-aGkQ',
'only_matching': True,
},
] ]
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
@ -1376,11 +1382,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if 'token' not in video_info: if 'token' not in video_info:
if 'reason' in video_info: if 'reason' in video_info:
if 'The uploader has not made this video available in your country.' in video_info['reason']: if 'The uploader has not made this video available in your country.' in video_info['reason']:
regions_allowed = self._html_search_meta('regionsAllowed', video_webpage, default=None) regions_allowed = self._html_search_meta(
if regions_allowed: 'regionsAllowed', video_webpage, default=None)
raise ExtractorError('YouTube said: This video is available in %s only' % ( countries = regions_allowed.split(',') if regions_allowed else None
', '.join(map(ISO3166Utils.short2full, regions_allowed.split(',')))), self.raise_geo_restricted(
expected=True) msg=video_info['reason'][0], countries=countries)
raise ExtractorError( raise ExtractorError(
'YouTube said: %s' % video_info['reason'][0], 'YouTube said: %s' % video_info['reason'][0],
expected=True, video_id=video_id) expected=True, video_id=video_id)
@ -2126,6 +2132,10 @@ class YoutubeChannelIE(YoutubePlaylistBaseInfoExtractor):
'id': 'UUs0ifCMCm1icqRbqhUINa0w', 'id': 'UUs0ifCMCm1icqRbqhUINa0w',
'title': 'Uploads from Deus Ex', 'title': 'Uploads from Deus Ex',
}, },
}, {
# geo restricted to JP
'url': 'https://www.youtube.com/user/kananishinoSMEJ',
'only_matching': True,
}] }]
@classmethod @classmethod
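
The geo-restriction hunk above replaces the hand-built "available in ... only" message with a list of country codes passed to `raise_geo_restricted`. A minimal sketch of just the splitting step follows. It assumes the `regionsAllowed` meta value is a comma-separated string of ISO 3166 codes, and `split_regions` is a hypothetical helper name that is not part of youtube-dl.

```python
def split_regions(regions_allowed):
    """Return a list of country codes, or None when the meta value is missing.

    Mirrors the expression in the hunk:
        regions_allowed.split(',') if regions_allowed else None
    The function name is hypothetical and used only for illustration.
    """
    return regions_allowed.split(',') if regions_allowed else None


# A missing or empty meta tag yields None rather than an empty list.
jp_only = split_regions('JP')
several = split_regions('JP,US,DE')
missing = split_regions(None)
```

Returning `None` instead of `[]` lets the caller tell "no region data on the page" apart from an actual list of countries.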


@ -679,8 +679,8 @@ def parseOpts(overrideArguments=None):
help=('Output filename template, see the "OUTPUT TEMPLATE" for all the info')) help=('Output filename template, see the "OUTPUT TEMPLATE" for all the info'))
filesystem.add_option( filesystem.add_option(
'--autonumber-size', '--autonumber-size',
dest='autonumber_size', metavar='NUMBER', default=5, type=int, dest='autonumber_size', metavar='NUMBER', type=int,
help='Specify the number of digits in %(autonumber)s when it is present in output filename template or --auto-number option is given (default is %default)') help=optparse.SUPPRESS_HELP)
filesystem.add_option( filesystem.add_option(
'--autonumber-start', '--autonumber-start',
dest='autonumber_start', metavar='NUMBER', default=1, type=int, dest='autonumber_start', metavar='NUMBER', default=1, type=int,
@ -692,15 +692,15 @@ def parseOpts(overrideArguments=None):
filesystem.add_option( filesystem.add_option(
'-A', '--auto-number', '-A', '--auto-number',
action='store_true', dest='autonumber', default=False, action='store_true', dest='autonumber', default=False,
help='[deprecated; use -o "%(autonumber)s-%(title)s.%(ext)s" ] Number downloaded files starting from 00000') help=optparse.SUPPRESS_HELP)
filesystem.add_option( filesystem.add_option(
'-t', '--title', '-t', '--title',
action='store_true', dest='usetitle', default=False, action='store_true', dest='usetitle', default=False,
help='[deprecated] Use title in file name (default)') help=optparse.SUPPRESS_HELP)
filesystem.add_option( filesystem.add_option(
'-l', '--literal', default=False, '-l', '--literal', default=False,
action='store_true', dest='usetitle', action='store_true', dest='usetitle',
help='[deprecated] Alias of --title') help=optparse.SUPPRESS_HELP)
filesystem.add_option( filesystem.add_option(
'-w', '--no-overwrites', '-w', '--no-overwrites',
action='store_true', dest='nooverwrites', default=False, action='store_true', dest='nooverwrites', default=False,
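
The options hunk above hides the deprecated flags by setting `help=optparse.SUPPRESS_HELP` and drops the default from `--autonumber-size`. The sketch below shows the effect with the standard library alone. The parser is a stand-in for illustration, not youtube-dl's real `parseOpts`, and the help string for `--no-overwrites` is made up.

```python
import optparse

# Stand-in parser; only the option names are taken from the diff.
parser = optparse.OptionParser(prog='sketch')

# A visible option keeps a normal help string (text here is illustrative).
parser.add_option(
    '-w', '--no-overwrites',
    action='store_true', dest='nooverwrites', default=False,
    help='Do not overwrite files')

# Hidden options still parse, but are omitted from the --help output.
parser.add_option(
    '-t', '--title',
    action='store_true', dest='usetitle', default=False,
    help=optparse.SUPPRESS_HELP)
parser.add_option(
    '--autonumber-size',
    dest='autonumber_size', metavar='NUMBER', type='int',
    help=optparse.SUPPRESS_HELP)

help_text = parser.format_help()
opts, _ = parser.parse_args(['-t'])
```

With no `default`, `opts.autonumber_size` is `None` unless the flag is given, so calling code can detect whether the user actually passed it.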


@ -536,8 +536,7 @@ class FFmpegSubtitlesConvertorPP(FFmpegPostProcessor):
ext = sub['ext'] ext = sub['ext']
if ext == new_ext: if ext == new_ext:
self._downloader.to_screen( self._downloader.to_screen(
'[ffmpeg] Subtitle file for %s is already in the requested' '[ffmpeg] Subtitle file for %s is already in the requested format' % new_ext)
'format' % new_ext)
continue continue
old_file = subtitles_filename(filename, lang, ext) old_file = subtitles_filename(filename, lang, ext)
sub_filenames.append(old_file) sub_filenames.append(old_file)
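
The subtitle hunk above fixes a missing space caused by implicit concatenation of adjacent string literals. A small sketch of the before and after behaviour, using `'srt'` as an assumed example value for `new_ext`:

```python
new_ext = 'srt'  # example value, assumed for illustration

# Before: the two literals are joined at compile time with no space
# between them, and only then is the % formatting applied.
before = (
    '[ffmpeg] Subtitle file for %s is already in the requested'
    'format' % new_ext)

# After: a single literal, so the space is preserved.
after = '[ffmpeg] Subtitle file for %s is already in the requested format' % new_ext
```

Because adjacent literals are merged before `%` is evaluated, the old code did not raise an error. It only printed "requestedformat".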


@ -1,3 +1,3 @@
from __future__ import unicode_literals from __future__ import unicode_literals
__version__ = '2017.02.17' __version__ = '2017.02.24.1'