Merge pull request #3 from rg3/master

世外桃源 2016-12-07 20:25:48 +08:00 committed by GitHub
commit ce805ca201
81 changed files with 1094 additions and 416 deletions


@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.11.08.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.12.01*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.11.08.1**
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.12.01**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2016.11.08.1
[debug] youtube-dl version 2016.12.01
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

.gitignore vendored

@ -31,6 +31,9 @@ updates_key.pem
*.mp3
*.3gp
*.wav
*.ape
*.mkv
*.swf
*.part
*.swp
test/testdata


@ -92,7 +92,7 @@ If you want to create a build of youtube-dl yourself, you'll need
### Adding support for a new site
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):


@ -1,6 +1,72 @@
version <unreleased>

Extractors
+ [thisoldhouse] Recognize /tv-episode/ URLs (#11271)


version 2016.12.01

Extractors
* [soundcloud] Update client id (#11327)
* [ruutu] Detect DRM protected videos
+ [liveleak] Add support for youtube embeds (#10688)
* [spike] Fix full episodes support (#11312)
* [comedycentral] Fix full episodes support
* [normalboots] Rewrite in terms of JWPlatform (#11184)
* [teamfourstar] Rewrite in terms of JWPlatform (#11184)
- [screenwavemedia] Remove extractor (#11184)


version 2016.11.27

Extractors
+ [webcaster] Add support for webcaster.pro
+ [azubu] Add support for azubu.uol.com.br (#11305)
* [viki] Prefer hls formats
* [viki] Fix rtmp formats extraction (#11255)
* [puls4] Relax URL regular expression (#11267)
* [vevo] Improve artist extraction (#10911)
* [mitele] Relax URL regular expression and extract more metadata (#11244)
+ [cbslocal] Recognize New York site (#11285)
+ [youtube:playlist] Pass disable_polymer in URL query (#11193)


version 2016.11.22

Extractors
* [hellporno] Fix video extension extraction (#11247)
+ [hellporno] Add support for hellporno.net (#11247)
+ [amcnetworks] Recognize more BBC America URLs (#11263)
* [funnyordie] Improve extraction (#11208)
* [extractor/generic] Improve limelight embeds support
- [crunchyroll] Remove ScaledBorderAndShadow from ASS subtitles (#8207, #9028)
* [bandcamp] Fix free downloads extraction and extract all formats (#11067)
* [twitter:card] Relax URL regular expression (#11225)
+ [tvanouvelles] Add support for tvanouvelles.ca (#10616)


version 2016.11.18

Extractors
* [youtube:live] Relax URL regular expression (#11164)
* [openload] Fix extraction (#10408, #11122)
* [vlive] Prefer locale over language for subtitles id (#11203)


version 2016.11.14.1

Core
+ [downoader/fragment,f4m,hls] Respect HTTP headers from info dict
* [extractor/common] Fix media templates with Bandwidth substitution pattern in
  MPD manifests (#11175)
* [extractor/common] Improve thumbnail extraction from JSON-LD

Extractors
+ [nrk] Workaround geo restriction
+ [nrk] Improve error detection and messages
+ [afreecatv] Add support for vod.afreecatv.com (#11174)
* [cda] Fix and improve extraction (#10929, #10936)
* [plays] Fix extraction (#11165)
* [eagleplatform] Fix extraction (#11160)
+ [audioboom] Recognize /posts/ URLs (#11149)


@ -1,7 +1,7 @@
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete
find . -name "*.class" -delete


@ -664,7 +664,7 @@ $ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
# Download best format available but not better that 480p
$ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
# Download best video only format but no bigger that 50 MB
# Download best video only format but no bigger than 50 MB
$ youtube-dl -f 'best[filesize<50M]'
# Download best format available via direct link over HTTP/HTTPS protocol
@ -930,7 +930,7 @@ If you want to create a build of youtube-dl yourself, you'll need
### Adding support for a new site
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):


@ -25,5 +25,6 @@ def build_completion(opt_parser):
filled_template = template.replace("{{flags}}", " ".join(opts_flag))
f.write(filled_template)
parser = youtube_dl.parseOpts()[0]
build_completion(parser)


@ -2,11 +2,13 @@
from __future__ import unicode_literals
import base64
import io
import json
import mimetypes
import netrc
import optparse
import os
import re
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
@ -90,16 +92,23 @@ class GitHubReleaser(object):
def main():
parser = optparse.OptionParser(usage='%prog VERSION BUILDPATH')
parser = optparse.OptionParser(usage='%prog CHANGELOG VERSION BUILDPATH')
options, args = parser.parse_args()
if len(args) != 2:
if len(args) != 3:
parser.error('Expected a version and a build directory')
version, build_path = args
changelog_file, version, build_path = args
with io.open(changelog_file, encoding='utf-8') as inf:
changelog = inf.read()
mobj = re.search(r'(?s)version %s\n{2}(.+?)\n{3}' % version, changelog)
body = mobj.group(1) if mobj else ''
releaser = GitHubReleaser()
new_release = releaser.create_release(version, name='youtube-dl %s' % version)
new_release = releaser.create_release(
version, name='youtube-dl %s' % version, body=body)
release_id = new_release['id']
for asset in os.listdir(build_path):
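The release script now pulls the version's release notes straight out of ChangeLog with one regex: the body is everything between the `version X` header (followed by a blank line) and the two blank lines that precede the next section. A minimal sketch of that extraction, using an abbreviated ChangeLog excerpt from this commit (the `release_body` helper name is just for illustration):

```python
import re

# Abbreviated from the ChangeLog in this commit.
CHANGELOG = """version 2016.12.01

Extractors
* [soundcloud] Update client id (#11327)
* [ruutu] Detect DRM protected videos


version 2016.11.27

Extractors
+ [webcaster] Add support for webcaster.pro
"""


def release_body(changelog, version):
    # Same pattern the script uses: header, two newlines, lazy capture,
    # then the three newlines that separate ChangeLog sections.
    mobj = re.search(r'(?s)version %s\n{2}(.+?)\n{3}' % version, changelog)
    return mobj.group(1) if mobj else ''


print(release_body(CHANGELOG, '2016.12.01'))
```

If the version is missing (or its section is not followed by the expected separator), the script falls back to an empty body rather than failing the release.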


@ -44,5 +44,6 @@ def build_completion(opt_parser):
with open(FISH_COMPLETION_FILE, 'w') as f:
f.write(filled_template)
parser = youtube_dl.parseOpts()[0]
build_completion(parser)


@ -23,6 +23,7 @@ def openssl_encode(algo, key, iv):
out, _ = prog.communicate(secret_msg)
return out
iv = key = [0x20, 0x15] + 14 * [0]
r = openssl_encode('aes-128-cbc', key, iv)


@ -32,5 +32,6 @@ def main():
with open('supportedsites.html', 'w', encoding='utf-8') as sitesf:
sitesf.write(template)
if __name__ == '__main__':
main()


@ -28,5 +28,6 @@ def main():
with io.open(outfile, 'w', encoding='utf-8') as outf:
outf.write(out)
if __name__ == '__main__':
main()


@ -59,6 +59,7 @@ def build_lazy_ie(ie, name):
s += make_valid_template.format(valid_url=ie._make_valid_url())
return s
# find the correct sorting and add the required base classes so that sublcasses
# can be correctly created
classes = _ALL_CLASSES[:-1]


@ -41,5 +41,6 @@ def main():
with io.open(outfile, 'w', encoding='utf-8') as outf:
outf.write(out)
if __name__ == '__main__':
main()


@ -74,5 +74,6 @@ def filter_options(readme):
return ret
if __name__ == '__main__':
main()


@ -110,7 +110,7 @@ RELEASE_FILES="youtube-dl youtube-dl.exe youtube-dl-$version.tar.gz"
for f in $RELEASE_FILES; do gpg --passphrase-repeat 5 --detach-sig "build/$version/$f"; done
ROOT=$(pwd)
python devscripts/create-github-release.py $version "$ROOT/build/$version"
python devscripts/create-github-release.py ChangeLog $version "$ROOT/build/$version"
ssh ytdl@yt-dl.org "sh html/update_latest.sh $version"


@ -44,5 +44,6 @@ def build_completion(opt_parser):
with open(ZSH_COMPLETION_FILE, "w") as f:
f.write(template)
parser = youtube_dl.parseOpts()[0]
build_completion(parser)


@ -158,6 +158,7 @@
- **CollegeRama**
- **ComCarCoff**
- **ComedyCentral**
- **ComedyCentralFullEpisodes**
- **ComedyCentralShortname**
- **ComedyCentralTV**
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
@ -643,7 +644,6 @@
- **Screencast**
- **ScreencastOMatic**
- **ScreenJunkies**
- **ScreenwaveMedia**
- **Seeker**
- **SenateISVP**
- **SendtoNews**
@ -715,7 +715,7 @@
- **teachertube:user:collection**: teachertube.com user and collection videos
- **TeachingChannel**
- **Teamcoco**
- **TeamFour**
- **TeamFourStar**
- **TechTalks**
- **techtv.mit.edu**
- **ted**
@ -771,6 +771,8 @@
- **TV2Article**
- **TV3**
- **TV4**: tv4.se and tv4play.se
- **TVANouvelles**
- **TVANouvellesArticle**
- **TVC**
- **TVCArticle**
- **tvigle**: Интернет-телевидение Tvigle.ru
@ -880,6 +882,8 @@
- **WatchIndianPorn**: Watch Indian Porn
- **WDR**
- **wdr:mobile**
- **Webcaster**
- **WebcasterFeed**
- **WebOfStories**
- **WebOfStoriesPlaylist**
- **WeiqiTV**: WQTV


@ -84,5 +84,6 @@ class TestInfoExtractor(unittest.TestCase):
self.assertRaises(ExtractorError, self.ie._download_json, uri, None)
self.assertEqual(self.ie._download_json(uri, None, fatal=False), None)
if __name__ == '__main__':
unittest.main()


@ -51,5 +51,6 @@ class TestAES(unittest.TestCase):
decrypted = (aes_decrypt_text(encrypted, password, 32))
self.assertEqual(decrypted, self.secret_msg)
if __name__ == '__main__':
unittest.main()


@ -60,6 +60,7 @@ def _file_md5(fn):
with open(fn, 'rb') as f:
return hashlib.md5(f.read()).hexdigest()
defs = gettestcases()
@ -217,6 +218,7 @@ def generator(test_case):
return test_template
# And add them to TestDownload
for n, test_case in enumerate(defs):
test_method = generator(test_case)


@ -39,5 +39,6 @@ class TestExecution(unittest.TestCase):
_, stderr = p.communicate()
self.assertFalse(stderr)
if __name__ == '__main__':
unittest.main()


@ -169,5 +169,6 @@ class TestProxy(unittest.TestCase):
# b'xn--fiq228c' is '中文'.encode('idna')
self.assertEqual(response, 'normal: http://xn--fiq228c.tw/')
if __name__ == '__main__':
unittest.main()


@ -43,5 +43,6 @@ class TestIqiyiSDKInterpreter(unittest.TestCase):
ie._login()
self.assertTrue('unable to log in:' in logger.messages[0])
if __name__ == '__main__':
unittest.main()


@ -104,6 +104,14 @@ class TestJSInterpreter(unittest.TestCase):
}''')
self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50])
def test_call(self):
jsi = JSInterpreter('''
function x() { return 2; }
function y(a) { return x() + a; }
function z() { return y(3); }
''')
self.assertEqual(jsi.call_function('z'), 5)
if __name__ == '__main__':
unittest.main()


@ -1075,5 +1075,6 @@ The first line
self.assertEqual(get_element_by_class('foo', html), 'nice')
self.assertEqual(get_element_by_class('no-such-class', html), None)
if __name__ == '__main__':
unittest.main()


@ -66,5 +66,6 @@ class TestVerboseOutput(unittest.TestCase):
self.assertTrue(b'-p' in serr)
self.assertTrue(b'secret' not in serr)
if __name__ == '__main__':
unittest.main()


@ -24,6 +24,7 @@ class YoutubeDL(youtube_dl.YoutubeDL):
super(YoutubeDL, self).__init__(*args, **kwargs)
self.to_stderr = self.to_screen
params = get_params({
'writeannotations': True,
'skip_download': True,
@ -74,5 +75,6 @@ class TestAnnotations(unittest.TestCase):
def tearDown(self):
try_rm(ANNOTATIONS_FILE)
if __name__ == '__main__':
unittest.main()


@ -66,5 +66,6 @@ class TestYoutubeLists(unittest.TestCase):
for entry in result['entries']:
self.assertTrue(entry.get('title'))
if __name__ == '__main__':
unittest.main()


@ -114,6 +114,7 @@ def make_tfunc(url, stype, sig_input, expected_sig):
test_func.__name__ = str('test_signature_' + stype + '_' + test_id)
setattr(TestSignature, test_func.__name__, test_func)
for test_spec in _TESTS:
make_tfunc(*test_spec)


@ -95,8 +95,7 @@ def _real_main(argv=None):
write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n')
except IOError:
sys.exit('ERROR: batch file could not be read')
all_urls = batch_urls + args
all_urls = [url.strip() for url in all_urls]
all_urls = batch_urls + [url.strip() for url in args]  # batch_urls are already striped in read_batch_urls
_enc = preferredencoding()
all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls]
@ -450,4 +449,5 @@ def main(argv=None):
except KeyboardInterrupt:
sys.exit('\nERROR: Interrupted by user')
__all__ = ['main', 'YoutubeDL', 'gen_extractors', 'list_extractors']
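The change above collapses two passes over the URL list into one: batch-file URLs come back from `read_batch_urls` already stripped, so only the command-line arguments still need trimming before the byte-decoding step. A standalone sketch of the resulting pipeline (the `collect_urls` helper and `encoding` parameter are illustrative names; the real code runs inline in `_real_main` using `preferredencoding()`):

```python
def collect_urls(batch_urls, args, encoding='utf-8'):
    # Batch-file URLs arrive pre-stripped; only positional args need it.
    all_urls = batch_urls + [url.strip() for url in args]
    # Mirror the decoding step that follows: bytes become text, other
    # values pass through unchanged.
    return [url.decode(encoding, 'ignore') if isinstance(url, bytes) else url
            for url in all_urls]


print(collect_urls(['http://a/'], ['  http://b/  ', b'http://c/']))
# → ['http://a/', 'http://b/', 'http://c/']
```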


@ -174,6 +174,7 @@ def aes_decrypt_text(data, password, key_size_bytes):
return plaintext return plaintext
RCON = (0x8d, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36) RCON = (0x8d, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36)
SBOX = (0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5, 0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76, SBOX = (0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5, 0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76,
0xCA, 0x82, 0xC9, 0x7D, 0xFA, 0x59, 0x47, 0xF0, 0xAD, 0xD4, 0xA2, 0xAF, 0x9C, 0xA4, 0x72, 0xC0, 0xCA, 0x82, 0xC9, 0x7D, 0xFA, 0x59, 0x47, 0xF0, 0xAD, 0xD4, 0xA2, 0xAF, 0x9C, 0xA4, 0x72, 0xC0,
@ -328,4 +329,5 @@ def inc(data):
break break
return data return data
__all__ = ['aes_encrypt', 'key_expansion', 'aes_ctr_decrypt', 'aes_cbc_decrypt', 'aes_decrypt_text'] __all__ = ['aes_encrypt', 'key_expansion', 'aes_ctr_decrypt', 'aes_cbc_decrypt', 'aes_decrypt_text']


@ -2491,6 +2491,7 @@ class _TreeBuilder(etree.TreeBuilder):
def doctype(self, name, pubid, system): def doctype(self, name, pubid, system):
pass pass
if sys.version_info[0] >= 3: if sys.version_info[0] >= 3:
def compat_etree_fromstring(text): def compat_etree_fromstring(text):
return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder())) return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder()))
@ -2787,6 +2788,7 @@ def workaround_optparse_bug9161():
return real_add_option(self, *bargs, **bkwargs) return real_add_option(self, *bargs, **bkwargs)
optparse.OptionGroup.add_option = _compat_add_option optparse.OptionGroup.add_option = _compat_add_option
if hasattr(shutil, 'get_terminal_size'): # Python >= 3.3 if hasattr(shutil, 'get_terminal_size'): # Python >= 3.3
compat_get_terminal_size = shutil.get_terminal_size compat_get_terminal_size = shutil.get_terminal_size
else: else:


@ -293,6 +293,7 @@ class FFmpegFD(ExternalFD):
class AVconvFD(FFmpegFD): class AVconvFD(FFmpegFD):
pass pass
_BY_NAME = dict( _BY_NAME = dict(
(klass.get_basename(), klass) (klass.get_basename(), klass)
for name, klass in globals().items() for name, klass in globals().items()


@ -314,7 +314,8 @@ class F4mFD(FragmentFD):
man_url = info_dict['url']
requested_bitrate = info_dict.get('tbr')
self.to_screen('[%s] Downloading f4m manifest' % self.FD_NAME)
urlh = self.ydl.urlopen(man_url)
urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url))
man_url = urlh.geturl()
# Some manifests may be malformed, e.g. prosiebensat1 generated manifests
# (see https://github.com/rg3/youtube-dl/issues/6215#issuecomment-121704244
@ -387,7 +388,10 @@ class F4mFD(FragmentFD):
url_parsed = base_url_parsed._replace(path=base_url_parsed.path + name, query='&'.join(query))
frag_filename = '%s-%s' % (ctx['tmpfilename'], name)
try:
success = ctx['dl'].download(frag_filename, {'url': url_parsed.geturl()})
success = ctx['dl'].download(frag_filename, {
'url': url_parsed.geturl(),
'http_headers': info_dict.get('http_headers'),
})
if not success:
return False
(down, frag_sanitized) = sanitize_open(frag_filename, 'rb')


@ -9,6 +9,7 @@ from ..utils import (
error_to_compat_str,
encodeFilename,
sanitize_open,
sanitized_Request,
)
@ -37,6 +38,10 @@ class FragmentFD(FileDownloader):
def report_skip_fragment(self, fragment_name):
self.to_screen('[download] Skipping fragment %s...' % fragment_name)
def _prepare_url(self, info_dict, url):
headers = info_dict.get('http_headers')
return sanitized_Request(url, None, headers) if headers else url
def _prepare_and_start_frag_download(self, ctx):
self._prepare_frag_download(ctx)
self._start_frag_download(ctx)
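The new `_prepare_url` helper is what lets the F4M and HLS downloaders honor extractor-supplied HTTP headers on the manifest request: when the info dict carries `http_headers`, the URL is wrapped in a request object that carries them; otherwise the bare URL passes through unchanged. A self-contained sketch of that logic, substituting the stdlib `Request` for youtube-dl's `sanitized_Request` wrapper (assumed equivalent for header handling here):

```python
try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2

def prepare_url(info_dict, url):
    # Wrap the URL in a Request carrying the extractor-supplied headers;
    # return the bare URL when there are none.
    headers = info_dict.get('http_headers')
    return Request(url, None, headers) if headers else url


info = {'http_headers': {'Referer': 'http://example.com/watch'}}
req = prepare_url(info, 'http://example.com/manifest.f4m')
print(req.get_header('Referer'))  # → http://example.com/watch
```

Fragment downloads get the same headers a different way: each per-fragment info dict now passes `'http_headers': info_dict.get('http_headers')` down to the inner downloader, as the f4m and hls hunks in this commit show.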


@ -59,7 +59,8 @@ class HlsFD(FragmentFD):
def real_download(self, filename, info_dict):
man_url = info_dict['url']
self.to_screen('[%s] Downloading m3u8 manifest' % self.FD_NAME)
manifest = self.ydl.urlopen(man_url).read()
manifest = self.ydl.urlopen(self._prepare_url(info_dict, man_url)).read()
s = manifest.decode('utf-8', 'ignore')
@ -112,7 +113,10 @@ class HlsFD(FragmentFD):
count = 0
while count <= fragment_retries:
try:
success = ctx['dl'].download(frag_filename, {'url': frag_url})
success = ctx['dl'].download(frag_filename, {
'url': frag_url,
'http_headers': info_dict.get('http_headers'),
})
if not success:
return False
down, frag_sanitized = sanitize_open(frag_filename, 'rb')


@ -26,7 +26,7 @@ class AENetworksIE(AENetworksBaseIE):
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|aetv|mylifetime)\.com|fyi\.tv)/(?:shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|movies/(?P<movie_display_id>[^/]+)/full-movie)'
_TESTS = [{
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
'md5': '8ff93eb073449f151d6b90c0ae1ef0c7',
'md5': 'a97a65f7e823ae10e9244bc5433d5fe6',
'info_dict': {
'id': '22253814',
'ext': 'mp4',
@ -99,7 +99,7 @@ class AENetworksIE(AENetworksBaseIE):
query = {
'mbr': 'true',
'assetTypes': 'medium_video_s3'
'assetTypes': 'high_video_s3'
}
video_id = self._html_search_meta('aetn:VideoID', webpage)
media_url = self._search_regex(
@ -155,7 +155,7 @@ class HistoryTopicIE(AENetworksBaseIE):
'id': 'world-war-i-history',
'title': 'World War I History',
},
'playlist_mincount': 24,
'playlist_mincount': 23,
}, {
'url': 'http://www.history.com/topics/world-war-i-history/videos',
'only_matching': True,
@ -193,7 +193,8 @@ class HistoryTopicIE(AENetworksBaseIE):
return self.theplatform_url_result(
release_url, video_id, {
'mbr': 'true',
'switch': 'hls'
'switch': 'hls',
'assetTypes': 'high_video_ak',
})
else:
webpage = self._download_webpage(url, topic_id)
@ -203,6 +204,7 @@ class HistoryTopicIE(AENetworksBaseIE):
entries.append(self.theplatform_url_result(
video_attributes['data-release-url'], video_attributes['data-id'], {
'mbr': 'true',
'switch': 'hls'
'switch': 'hls',
'assetTypes': 'high_video_ak',
}))
return self.playlist_result(entries, topic_id, get_element_by_attribute('class', 'show-title', webpage))


@ -11,6 +11,7 @@ from ..compat import (
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none, int_or_none,
update_url_query,
     xpath_element,
     xpath_text,
 )
@@ -18,12 +19,18 @@ from ..utils import (
 class AfreecaTVIE(InfoExtractor):
     IE_DESC = 'afreecatv.com'
-    _VALID_URL = r'''(?x)^
-        https?://(?:(live|afbbs|www)\.)?afreeca(?:tv)?\.com(?::\d+)?
-        (?:
-            /app/(?:index|read_ucc_bbs)\.cgi|
-            /player/[Pp]layer\.(?:swf|html))
-        \?.*?\bnTitleNo=(?P<id>\d+)'''
+    _VALID_URL = r'''(?x)
+        https?://
+            (?:
+                (?:(?:live|afbbs|www)\.)?afreeca(?:tv)?\.com(?::\d+)?
+                (?:
+                    /app/(?:index|read_ucc_bbs)\.cgi|
+                    /player/[Pp]layer\.(?:swf|html)
+                )\?.*?\bnTitleNo=|
+                vod\.afreecatv\.com/PLAYER/STATION/
+            )
+            (?P<id>\d+)
+    '''
     _TESTS = [{
         'url': 'http://live.afreecatv.com:8079/app/index.cgi?szType=read_ucc_bbs&szBjId=dailyapril&nStationNo=16711924&nBbsNo=18605867&nTitleNo=36164052&szSkin=',
         'md5': 'f72c89fe7ecc14c1b5ce506c4996046e',
@@ -66,6 +73,9 @@ class AfreecaTVIE(InfoExtractor):
     }, {
         'url': 'http://www.afreecatv.com/player/Player.swf?szType=szBjId=djleegoon&nStationNo=11273158&nBbsNo=13161095&nTitleNo=36327652',
         'only_matching': True,
+    }, {
+        'url': 'http://vod.afreecatv.com/PLAYER/STATION/15055030',
+        'only_matching': True,
     }]

     @staticmethod
@@ -83,7 +93,9 @@ class AfreecaTVIE(InfoExtractor):
         info_url = compat_urlparse.urlunparse(parsed_url._replace(
             netloc='afbbs.afreecatv.com:8080',
             path='/api/video/get_video_info.php'))
-        video_xml = self._download_xml(info_url, video_id)
+
+        video_xml = self._download_xml(
+            update_url_query(info_url, {'nTitleNo': video_id}), video_id)
         if xpath_element(video_xml, './track/video/file') is None:
             raise ExtractorError('Specified AfreecaTV video does not exist',
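The reworked `_VALID_URL` above now covers both the legacy player URLs and the new `vod.afreecatv.com` style. A quick standalone sanity check of the pattern (both URLs are taken from the `_TESTS` entries in this hunk):

```python
import re

# The reworked _VALID_URL from the hunk above, minus youtube-dl plumbing.
pattern = re.compile(r'''(?x)
    https?://
        (?:
            (?:(?:live|afbbs|www)\.)?afreeca(?:tv)?\.com(?::\d+)?
            (?:
                /app/(?:index|read_ucc_bbs)\.cgi|
                /player/[Pp]layer\.(?:swf|html)
            )\?.*?\bnTitleNo=|
            vod\.afreecatv\.com/PLAYER/STATION/
        )
        (?P<id>\d+)
''')

# Legacy player URL from the existing test
legacy = pattern.match(
    'http://live.afreecatv.com:8079/app/index.cgi?szType=read_ucc_bbs'
    '&szBjId=dailyapril&nStationNo=16711924&nBbsNo=18605867'
    '&nTitleNo=36164052&szSkin=')
print(legacy.group('id'))  # 36164052

# New VOD URL added by this change
vod = pattern.match('http://vod.afreecatv.com/PLAYER/STATION/15055030')
print(vod.group('id'))  # 15055030
```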

View File

@@ -10,7 +10,7 @@ from ..utils import (
 class AMCNetworksIE(ThePlatformIE):
-    _VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?season-\d+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
+    _VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?[^/]+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
     _TESTS = [{
         'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
         'md5': '',
@@ -41,6 +41,9 @@ class AMCNetworksIE(ThePlatformIE):
     }, {
         'url': 'http://www.ifc.com/movies/chaos',
         'only_matching': True,
+    }, {
+        'url': 'http://www.bbcamerica.com/shows/doctor-who/full-episodes/the-power-of-the-daleks/episode-01-episode-1-color-version',
+        'only_matching': True,
     }]

     def _real_extract(self, url):

View File

@@ -11,7 +11,7 @@ from ..utils import (
 class AzubuIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?azubu\.tv/[^/]+#!/play/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?azubu\.(?:tv|uol.com.br)/[^/]+#!/play/(?P<id>\d+)'
     _TESTS = [
         {
             'url': 'http://www.azubu.tv/GSL#!/play/15575/2014-hot6-cup-last-big-match-ro8-day-1',
@@ -103,12 +103,15 @@ class AzubuIE(InfoExtractor):
 class AzubuLiveIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?azubu\.tv/(?P<id>[^/]+)$'
-    _TEST = {
+    _VALID_URL = r'https?://(?:www\.)?azubu\.(?:tv|uol.com.br)/(?P<id>[^/]+)$'
+    _TESTS = [{
         'url': 'http://www.azubu.tv/MarsTVMDLen',
         'only_matching': True,
-    }
+    }, {
+        'url': 'http://azubu.uol.com.br/adolfz',
+        'only_matching': True,
+    }]

     def _real_extract(self, url):
         user = self._match_id(url)

View File

@@ -1,7 +1,9 @@
 from __future__ import unicode_literals

 import json
+import random
 import re
+import time

 from .common import InfoExtractor
 from ..compat import (
@@ -12,6 +14,9 @@ from ..utils import (
     ExtractorError,
     float_or_none,
     int_or_none,
+    parse_filesize,
+    unescapeHTML,
+    update_url_query,
 )
@@ -81,35 +86,68 @@ class BandcampIE(InfoExtractor):
             r'(?ms)var TralbumData = .*?[{,]\s*id: (?P<id>\d+),?$',
             webpage, 'video id')

-        download_webpage = self._download_webpage(download_link, video_id, 'Downloading free downloads page')
-        # We get the dictionary of the track from some javascript code
-        all_info = self._parse_json(self._search_regex(
-            r'(?sm)items: (.*?),$', download_webpage, 'items'), video_id)
-        info = all_info[0]
-        # We pick mp3-320 for now, until format selection can be easily implemented.
-        mp3_info = info['downloads']['mp3-320']
-        # If we try to use this url it says the link has expired
-        initial_url = mp3_info['url']
-        m_url = re.match(
-            r'(?P<server>http://(.*?)\.bandcamp\.com)/download/track\?enc=mp3-320&fsig=(?P<fsig>.*?)&id=(?P<id>.*?)&ts=(?P<ts>.*)$',
-            initial_url)
-        # We build the url we will use to get the final track url
-        # This url is build in Bandcamp in the script download_bunde_*.js
-        request_url = '%s/statdownload/track?enc=mp3-320&fsig=%s&id=%s&ts=%s&.rand=665028774616&.vrs=1' % (m_url.group('server'), m_url.group('fsig'), video_id, m_url.group('ts'))
-        final_url_webpage = self._download_webpage(request_url, video_id, 'Requesting download url')
-        # If we could correctly generate the .rand field the url would be
-        # in the "download_url" key
-        final_url = self._proto_relative_url(self._search_regex(
-            r'"retry_url":"(.+?)"', final_url_webpage, 'final video URL'), 'http:')
+        download_webpage = self._download_webpage(
+            download_link, video_id, 'Downloading free downloads page')
+
+        blob = self._parse_json(
+            self._search_regex(
+                r'data-blob=(["\'])(?P<blob>{.+?})\1', download_webpage,
+                'blob', group='blob'),
+            video_id, transform_source=unescapeHTML)
+
+        info = blob['digital_items'][0]
+
+        downloads = info['downloads']
+        track = info['title']
+
+        artist = info.get('artist')
+        title = '%s - %s' % (artist, track) if artist else track
+
+        download_formats = {}
+        for f in blob['download_formats']:
+            name, ext = f.get('name'), f.get('file_extension')
+            if all(isinstance(x, compat_str) for x in (name, ext)):
+                download_formats[name] = ext.strip('.')
+
+        formats = []
+        for format_id, f in downloads.items():
+            format_url = f.get('url')
+            if not format_url:
+                continue
+            # Stat URL generation algorithm is reverse engineered from
+            # download_*_bundle_*.js
+            stat_url = update_url_query(
+                format_url.replace('/download/', '/statdownload/'), {
+                    '.rand': int(time.time() * 1000 * random.random()),
+                })
+            format_id = f.get('encoding_name') or format_id
+            stat = self._download_json(
+                stat_url, video_id, 'Downloading %s JSON' % format_id,
+                transform_source=lambda s: s[s.index('{'):s.rindex('}') + 1],
+                fatal=False)
+            if not stat:
+                continue
+            retry_url = stat.get('retry_url')
+            if not isinstance(retry_url, compat_str):
+                continue
+            formats.append({
+                'url': self._proto_relative_url(retry_url, 'http:'),
+                'ext': download_formats.get(format_id),
+                'format_id': format_id,
+                'format_note': f.get('description'),
+                'filesize': parse_filesize(f.get('size_mb')),
+                'vcodec': 'none',
+            })
+        self._sort_formats(formats)

         return {
             'id': video_id,
-            'title': info['title'],
-            'ext': 'mp3',
-            'vcodec': 'none',
-            'url': final_url,
+            'title': title,
             'thumbnail': info.get('thumb_url'),
             'uploader': info.get('artist'),
+            'artist': artist,
+            'track': track,
+            'formats': formats,
         }
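The new Bandcamp code builds a stat URL for each download format by swapping `/download/` for `/statdownload/` and appending a random `.rand` cache-buster, per the "reverse engineered" comment in the change. A minimal sketch of that transformation using only the Python 3 standard library in place of youtube-dl's `update_url_query` helper (the example host and query values are made up):

```python
import random
import time
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

def stat_url(format_url):
    """Rebuild a download URL into a stat URL the way the new extractor
    does: swap /download/ for /statdownload/ and append a millisecond-scale
    random .rand parameter."""
    url = format_url.replace('/download/', '/statdownload/')
    parsed = urlparse(url)
    query = parse_qsl(parsed.query)
    query.append(('.rand', str(int(time.time() * 1000 * random.random()))))
    return urlunparse(parsed._replace(query=urlencode(query)))

# Hypothetical download URL, for illustration only
url = stat_url('http://example.bandcamp.com/download/track?enc=mp3-320&id=1')
print(url)
```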

View File

@@ -45,7 +45,8 @@ class BloombergIE(InfoExtractor):
         name = self._match_id(url)
         webpage = self._download_webpage(url, name)
         video_id = self._search_regex(
-            r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>.+?)\1',
+            (r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1',
+             r'videoId\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1'),
             webpage, 'id', group='url', default=None)
         if not video_id:
             bplayer_data = self._parse_json(self._search_regex(

View File

@@ -4,11 +4,14 @@ from __future__ import unicode_literals
 from .anvato import AnvatoIE
 from .sendtonews import SendtoNewsIE
 from ..compat import compat_urlparse
-from ..utils import unified_timestamp
+from ..utils import (
+    parse_iso8601,
+    unified_timestamp,
+)


 class CBSLocalIE(AnvatoIE):
-    _VALID_URL = r'https?://[a-z]+\.cbslocal\.com/\d+/\d+/\d+/(?P<id>[0-9a-z-]+)'
+    _VALID_URL = r'https?://[a-z]+\.cbslocal\.com/(?:\d+/\d+/\d+|video)/(?P<id>[0-9a-z-]+)'

     _TESTS = [{
         # Anvato backend
@@ -49,6 +52,31 @@ class CBSLocalIE(AnvatoIE):
             # m3u8 download
             'skip_download': True,
         },
+    }, {
+        'url': 'http://newyork.cbslocal.com/video/3580809-a-very-blue-anniversary/',
+        'info_dict': {
+            'id': '3580809',
+            'ext': 'mp4',
+            'title': 'A Very Blue Anniversary',
+            'description': 'CBS2s Cindy Hsu has more.',
+            'thumbnail': 're:^https?://.*',
+            'timestamp': 1479962220,
+            'upload_date': '20161124',
+            'uploader': 'CBS',
+            'subtitles': {
+                'en': 'mincount:5',
+            },
+            'categories': [
+                'Stations\\Spoken Word\\WCBSTV',
+                'Syndication\\AOL',
+                'Syndication\\MSN',
+                'Syndication\\NDN',
+                'Syndication\\Yahoo',
+                'Content\\News',
+                'Content\\News\\Local News',
+            ],
+            'tags': ['CBS 2 News Weekends', 'Cindy Hsu', 'Blue Man Group'],
+        },
     }]

     def _real_extract(self, url):
@@ -64,8 +92,11 @@ class CBSLocalIE(AnvatoIE):
         info_dict = self._extract_anvato_videos(webpage, display_id)

         time_str = self._html_search_regex(
-            r'class="entry-date">([^<]+)<', webpage, 'released date', fatal=False)
-        timestamp = unified_timestamp(time_str)
+            r'class="entry-date">([^<]+)<', webpage, 'released date', default=None)
+        if time_str:
+            timestamp = unified_timestamp(time_str)
+        else:
+            timestamp = parse_iso8601(self._html_search_meta('uploadDate', webpage))

         info_dict.update({
             'display_id': display_id,
View File

@@ -5,14 +5,16 @@ import re

 from .common import InfoExtractor
 from ..utils import (
-    decode_packed_codes,
     ExtractorError,
-    parse_duration
+    float_or_none,
+    int_or_none,
+    parse_duration,
 )


 class CDAIE(InfoExtractor):
     _VALID_URL = r'https?://(?:(?:www\.)?cda\.pl/video|ebd\.cda\.pl/[0-9]+x[0-9]+)/(?P<id>[0-9a-z]+)'
+    _BASE_URL = 'http://www.cda.pl/'
     _TESTS = [{
         'url': 'http://www.cda.pl/video/5749950c',
         'md5': '6f844bf51b15f31fae165365707ae970',
@@ -21,6 +23,9 @@ class CDAIE(InfoExtractor):
             'ext': 'mp4',
             'height': 720,
             'title': 'Oto dlaczego przed zakrętem należy zwolnić.',
+            'description': 'md5:269ccd135d550da90d1662651fcb9772',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'average_rating': float,
             'duration': 39
         }
     }, {
@@ -30,6 +35,11 @@ class CDAIE(InfoExtractor):
             'id': '57413289',
             'ext': 'mp4',
             'title': 'Lądowanie na lotnisku na Maderze',
+            'description': 'md5:60d76b71186dcce4e0ba6d4bbdb13e1a',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'uploader': 'crash404',
+            'view_count': int,
+            'average_rating': float,
             'duration': 137
         }
     }, {
@@ -39,31 +49,55 @@ class CDAIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)
-        webpage = self._download_webpage('http://ebd.cda.pl/0x0/' + video_id, video_id)
+        self._set_cookie('cda.pl', 'cda.player', 'html5')
+        webpage = self._download_webpage(
+            self._BASE_URL + '/video/' + video_id, video_id)

         if 'Ten film jest dostępny dla użytkowników premium' in webpage:
             raise ExtractorError('This video is only available for premium users.', expected=True)

-        title = self._html_search_regex(r'<title>(.+?)</title>', webpage, 'title')
         formats = []

+        uploader = self._search_regex(r'''(?x)
+            <(span|meta)[^>]+itemprop=(["\'])author\2[^>]*>
+            (?:<\1[^>]*>[^<]*</\1>|(?!</\1>)(?:.|\n))*?
+            <(span|meta)[^>]+itemprop=(["\'])name\4[^>]*>(?P<uploader>[^<]+)</\3>
+        ''', webpage, 'uploader', default=None, group='uploader')
+        view_count = self._search_regex(
+            r'Odsłony:(?:\s|&nbsp;)*([0-9]+)', webpage,
+            'view_count', default=None)
+        average_rating = self._search_regex(
+            r'<(?:span|meta)[^>]+itemprop=(["\'])ratingValue\1[^>]*>(?P<rating_value>[0-9.]+)',
+            webpage, 'rating', fatal=False, group='rating_value')
+
         info_dict = {
             'id': video_id,
-            'title': title,
+            'title': self._og_search_title(webpage),
+            'description': self._og_search_description(webpage),
+            'uploader': uploader,
+            'view_count': int_or_none(view_count),
+            'average_rating': float_or_none(average_rating),
+            'thumbnail': self._og_search_thumbnail(webpage),
             'formats': formats,
             'duration': None,
         }

         def extract_format(page, version):
-            unpacked = decode_packed_codes(page)
-            format_url = self._search_regex(
-                r"(?:file|url)\s*:\s*(\\?[\"'])(?P<url>http.+?)\1", unpacked,
-                '%s url' % version, fatal=False, group='url')
-            if not format_url:
+            json_str = self._search_regex(
+                r'player_data=(\\?["\'])(?P<player_data>.+?)\1', page,
+                '%s player_json' % version, fatal=False, group='player_data')
+            if not json_str:
+                return
+            player_data = self._parse_json(
+                json_str, '%s player_data' % version, fatal=False)
+            if not player_data:
+                return
+            video = player_data.get('video')
+            if not video or 'file' not in video:
+                self.report_warning('Unable to extract %s version information' % version)
                 return
             f = {
-                'url': format_url,
+                'url': video['file'],
             }
             m = re.search(
                 r'<a[^>]+data-quality="(?P<format_id>[^"]+)"[^>]+href="[^"]+"[^>]+class="[^"]*quality-btn-active[^"]*">(?P<height>[0-9]+)p',
@@ -75,9 +109,7 @@ class CDAIE(InfoExtractor):
             })
             info_dict['formats'].append(f)
             if not info_dict['duration']:
-                info_dict['duration'] = parse_duration(self._search_regex(
-                    r"duration\s*:\s*(\\?[\"'])(?P<duration>.+?)\1",
-                    unpacked, 'duration', fatal=False, group='duration'))
+                info_dict['duration'] = parse_duration(video.get('duration'))

         extract_format(webpage, 'default')
@@ -85,7 +117,8 @@ class CDAIE(InfoExtractor):
             r'<a[^>]+data-quality="[^"]+"[^>]+href="([^"]+)"[^>]+class="quality-btn"[^>]*>([0-9]+p)',
             webpage):
             webpage = self._download_webpage(
-                href, video_id, 'Downloading %s version information' % resolution, fatal=False)
+                self._BASE_URL + href, video_id,
+                'Downloading %s version information' % resolution, fatal=False)
             if not webpage:
                 # Manually report warning because empty page is returned when
                 # invalid version is requested.
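The CDA change above stops unpacking obfuscated JavaScript and instead reads a JSON `player_data` attribute from the html5 player page. The extraction step can be illustrated in isolation with the same regex from the change (the page fragment and values below are invented for the example):

```python
import json
import re

# Illustrative page fragment only; real pages embed a much larger object.
page = '<div player_data=\'{"video": {"file": "http://example.com/v.mp4", "duration": "39"}}\'>'

# The player_data regex from the new extract_format()
m = re.search(r'player_data=(\\?["\'])(?P<player_data>.+?)\1', page)
video = json.loads(m.group('player_data'))['video']
print(video['file'])      # http://example.com/v.mp4
print(video['duration'])  # 39
```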

View File

@@ -6,7 +6,7 @@ from .common import InfoExtractor

 class ComedyCentralIE(MTVServicesInfoExtractor):
     _VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
-        (video-clips|episodes|cc-studios|video-collections|full-episodes|shows)
+        (video-clips|episodes|cc-studios|video-collections|shows(?=/[^/]+/(?!full-episodes)))
        /(?P<title>.*)'''
     _FEED_URL = 'http://comedycentral.com/feeds/mrss/'

@@ -27,6 +27,40 @@ class ComedyCentralIE(MTVServicesInfoExtractor):
     }]


+class ComedyCentralFullEpisodesIE(MTVServicesInfoExtractor):
+    _VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
+        (?:full-episodes|shows(?=/[^/]+/full-episodes))
+        /(?P<id>[^?]+)'''
+    _FEED_URL = 'http://comedycentral.com/feeds/mrss/'
+
+    _TESTS = [{
+        'url': 'http://www.cc.com/full-episodes/pv391a/the-daily-show-with-trevor-noah-november-28--2016---ryan-speedo-green-season-22-ep-22028',
+        'info_dict': {
+            'description': 'Donald Trump is accused of exploiting his president-elect status for personal gain, Cuban leader Fidel Castro dies, and Ryan Speedo Green discusses "Sing for Your Life."',
+            'title': 'November 28, 2016 - Ryan Speedo Green',
+        },
+        'playlist_count': 4,
+    }, {
+        'url': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+        webpage = self._download_webpage(url, playlist_id)
+
+        feed_json = self._search_regex(r'var triforceManifestFeed\s*=\s*(\{.+?\});\n', webpage, 'triforce feeed')
+        feed = self._parse_json(feed_json, playlist_id)
+        zones = feed['manifest']['zones']
+
+        video_zone = zones['t2_lc_promo1']
+        feed = self._download_json(video_zone['feed'], playlist_id)
+        mgid = feed['result']['data']['id']
+
+        videos_info = self._get_videos_info(mgid)
+        return videos_info
+
+
 class ToshIE(MTVServicesInfoExtractor):
     IE_DESC = 'Tosh.0'
     _VALID_URL = r'^https?://tosh\.cc\.com/video-(?:clips|collections)/[^/]+/(?P<videotitle>[^/?#]+)'

View File

@@ -886,7 +886,7 @@ class InfoExtractor(object):
                 'url': e.get('contentUrl'),
                 'title': unescapeHTML(e.get('name')),
                 'description': unescapeHTML(e.get('description')),
-                'thumbnail': e.get('thumbnailUrl'),
+                'thumbnail': e.get('thumbnailUrl') or e.get('thumbnailURL'),
                 'duration': parse_duration(e.get('duration')),
                 'timestamp': unified_timestamp(e.get('uploadDate')),
                 'filesize': float_or_none(e.get('contentSize')),
@@ -1703,7 +1703,7 @@ class InfoExtractor(object):
                     representation_ms_info['fragments'] = [{
                         'url': media_template % {
                             'Number': segment_number,
-                            'Bandwidth': representation_attrib.get('bandwidth'),
+                            'Bandwidth': int_or_none(representation_attrib.get('bandwidth')),
                         },
                         'duration': segment_duration,
                     } for segment_number in range(
@@ -1721,7 +1721,7 @@ class InfoExtractor(object):
                     def add_segment_url():
                         segment_url = media_template % {
                             'Time': segment_time,
-                            'Bandwidth': representation_attrib.get('bandwidth'),
+                            'Bandwidth': int_or_none(representation_attrib.get('bandwidth')),
                             'Number': segment_number,
                         }
                         representation_ms_info['fragments'].append({
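Wrapping the `bandwidth` attribute in `int_or_none` matters because DASH media templates are expanded with printf-style conversions, and a `%d` conversion raises `TypeError` on the string values that come straight out of XML attributes. A minimal illustration (the template shape below is made up, not the exact one youtube-dl produces):

```python
# Hypothetical DASH media template using a %d conversion
template = 'segment-%(Number)d-%(Bandwidth)d.m4s'

# Raw XML attribute values are strings, so %d substitution fails
try:
    template % {'Number': 1, 'Bandwidth': '4000000'}
except TypeError as exc:
    print('str bandwidth fails: %s' % exc)

# An int_or_none-style cast makes the substitution succeed
print(template % {'Number': 1, 'Bandwidth': int('4000000')})
# segment-1-4000000.m4s
```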

View File

@@ -236,7 +236,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
         output += 'WrapStyle: %s\n' % sub_root.attrib['wrap_style']
         output += 'PlayResX: %s\n' % sub_root.attrib['play_res_x']
         output += 'PlayResY: %s\n' % sub_root.attrib['play_res_y']
-        output += """ScaledBorderAndShadow: yes
+        output += """ScaledBorderAndShadow: no

 [V4+ Styles]
 Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding

View File

@@ -180,6 +180,7 @@ from .cnn import (
 from .coub import CoubIE
 from .collegerama import CollegeRamaIE
 from .comedycentral import (
+    ComedyCentralFullEpisodesIE,
     ComedyCentralIE,
     ComedyCentralShortnameIE,
     ComedyCentralTVIE,
@@ -804,7 +805,6 @@ from .scivee import SciVeeIE
 from .screencast import ScreencastIE
 from .screencastomatic import ScreencastOMaticIE
 from .screenjunkies import ScreenJunkiesIE
-from .screenwavemedia import ScreenwaveMediaIE, TeamFourIE
 from .seeker import SeekerIE
 from .senateisvp import SenateISVPIE
 from .sendtonews import SendtoNewsIE
@@ -897,6 +897,7 @@ from .teachertube import (
 )
 from .teachingchannel import TeachingChannelIE
 from .teamcoco import TeamcocoIE
+from .teamfourstar import TeamFourStarIE
 from .techtalks import TechTalksIE
 from .ted import TEDIE
 from .tele13 import Tele13IE
@@ -965,6 +966,10 @@ from .tv2 import (
 )
 from .tv3 import TV3IE
 from .tv4 import TV4IE
+from .tvanouvelles import (
+    TVANouvellesIE,
+    TVANouvellesArticleIE,
+)
 from .tvc import (
     TVCIE,
     TVCArticleIE,
@@ -1117,6 +1122,10 @@ from .wdr import (
     WDRIE,
     WDRMobileIE,
 )
+from .webcaster import (
+    WebcasterIE,
+    WebcasterFeedIE,
+)
 from .webofstories import (
     WebOfStoriesIE,
     WebOfStoriesPlaylistIE,

View File

@@ -2,7 +2,10 @@
 from __future__ import unicode_literals

 from .common import InfoExtractor
-from ..compat import compat_urlparse
+from ..compat import (
+    compat_str,
+    compat_urlparse,
+)
 from ..utils import (
     int_or_none,
     qualities,
@@ -22,8 +25,7 @@ class FirstTVIE(InfoExtractor):
         'info_dict': {
             'id': '40049',
             'ext': 'mp4',
             'title': 'Гость Людмила Сенчина. Наедине со всеми. Выпуск от 12.02.2015',
-            'description': 'md5:36a39c1d19618fec57d12efe212a8370',
             'thumbnail': 're:^https?://.*\.(?:jpg|JPG)$',
             'upload_date': '20150212',
             'duration': 2694,
@@ -34,8 +36,7 @@
         'info_dict': {
             'id': '364746',
             'ext': 'mp4',
             'title': 'Весенняя аллергия. Доброе утро. Фрагмент выпуска от 07.04.2016',
-            'description': 'md5:a242eea0031fd180a4497d52640a9572',
             'thumbnail': 're:^https?://.*\.(?:jpg|JPG)$',
             'upload_date': '20160407',
             'duration': 179,
@@ -44,6 +45,17 @@ class FirstTVIE(InfoExtractor):
         'params': {
             'skip_download': True,
         },
+    }, {
+        'url': 'http://www.1tv.ru/news/issue/2016-12-01/14:00',
+        'info_dict': {
+            'id': '14:00',
+            'title': 'Выпуск новостей в 14:00 1 декабря 2016 года. Новости. Первый канал',
+            'description': 'md5:2e921b948f8c1ff93901da78ebdb1dfd',
+        },
+        'playlist_count': 13,
+    }, {
+        'url': 'http://www.1tv.ru/shows/tochvtoch-supersezon/vystupleniya/evgeniy-dyatlov-vladimir-vysockiy-koni-priveredlivye-toch-v-toch-supersezon-fragment-vypuska-ot-06-11-2016',
+        'only_matching': True,
     }]

     def _real_extract(self, url):
@@ -51,43 +63,66 @@ class FirstTVIE(InfoExtractor):
         webpage = self._download_webpage(url, display_id)
         playlist_url = compat_urlparse.urljoin(url, self._search_regex(
-            r'data-playlist-url="([^"]+)', webpage, 'playlist url'))
+            r'data-playlist-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
+            webpage, 'playlist url', group='url'))

-        item = self._download_json(playlist_url, display_id)[0]
-        video_id = item['id']
-        quality = qualities(('ld', 'sd', 'hd', ))
-        formats = []
-        for f in item.get('mbr', []):
-            src = f.get('src')
-            if not src:
-                continue
-            fname = f.get('name')
-            formats.append({
-                'url': src,
-                'format_id': fname,
-                'quality': quality(fname),
+        parsed_url = compat_urlparse.urlparse(playlist_url)
+        qs = compat_urlparse.parse_qs(parsed_url.query)
+        item_ids = qs.get('videos_ids[]') or qs.get('news_ids[]')
+
+        items = self._download_json(playlist_url, display_id)
+
+        if item_ids:
+            items = [
+                item for item in items
+                if item.get('uid') and compat_str(item['uid']) in item_ids]
+        else:
+            items = [items[0]]
+
+        entries = []
+        QUALITIES = ('ld', 'sd', 'hd', )
+        for item in items:
+            title = item['title']
+            quality = qualities(QUALITIES)
+            formats = []
+            for f in item.get('mbr', []):
+                src = f.get('src')
+                if not src or not isinstance(src, compat_str):
+                    continue
+                tbr = int_or_none(self._search_regex(
+                    r'_(\d{3,})\.mp4', src, 'tbr', default=None))
+                formats.append({
+                    'url': src,
+                    'format_id': f.get('name'),
+                    'tbr': tbr,
+                    'quality': quality(f.get('name')),
+                })
+            self._sort_formats(formats)
+
+            thumbnail = item.get('poster') or self._og_search_thumbnail(webpage)
+            duration = int_or_none(item.get('duration') or self._html_search_meta(
+                'video:duration', webpage, 'video duration', fatal=False))
+            upload_date = unified_strdate(self._html_search_meta(
+                'ya:ovs:upload_date', webpage, 'upload date', default=None))
+
+            entries.append({
+                'id': compat_str(item.get('id') or item['uid']),
+                'thumbnail': thumbnail,
+                'title': title,
+                'upload_date': upload_date,
+                'duration': int_or_none(duration),
+                'formats': formats
             })
-        self._sort_formats(formats)

         title = self._html_search_regex(
             (r'<div class="tv_translation">\s*<h1><a href="[^"]+">([^<]*)</a>',
              r"'title'\s*:\s*'([^']+)'"),
-            webpage, 'title', default=None) or item['title']
+            webpage, 'title', default=None) or self._og_search_title(
+            webpage, default=None)
         description = self._html_search_regex(
             r'<div class="descr">\s*<div>&nbsp;</div>\s*<p>([^<]*)</p></div>',
             webpage, 'description', default=None) or self._html_search_meta(
-            'description', webpage, 'description')
-        duration = int_or_none(self._html_search_meta(
-            'video:duration', webpage, 'video duration', fatal=False))
-        upload_date = unified_strdate(self._html_search_meta(
-            'ya:ovs:upload_date', webpage, 'upload date', fatal=False))
+            'description', webpage, 'description', default=None)

-        return {
-            'id': video_id,
-            'thumbnail': item.get('poster') or self._og_search_thumbnail(webpage),
-            'title': title,
-            'description': description,
-            'upload_date': upload_date,
-            'duration': int_or_none(duration),
-            'formats': formats
-        }
+        return self.playlist_result(entries, display_id, title, description)
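The FirstTV change selects playlist items by parsing bracketed query keys (`videos_ids[]` / `news_ids[]`) out of the playlist URL. Python's query parser keeps such keys verbatim, which is why the lookup uses the literal bracketed strings. A small sketch with the Python 3 stdlib (youtube-dl uses the `compat_urlparse` equivalent; the URL below is invented):

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical playlist URL; parse_qs keeps the bracketed keys verbatim.
playlist_url = 'http://www.1tv.ru/playlist?news_ids[]=40049&news_ids[]=364746'
qs = parse_qs(urlparse(playlist_url).query)
item_ids = qs.get('videos_ids[]') or qs.get('news_ids[]')
print(item_ids)  # ['40049', '364746']
```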

View File

@@ -28,6 +28,9 @@ class FunnyOrDieIE(InfoExtractor):
         'description': 'Please use this to sell something. www.jonlajoie.com',
         'thumbnail': 're:^http:.*\.jpg$',
     },
+    'params': {
+        'skip_download': True,
+    },
 }, {
     'url': 'http://www.funnyordie.com/articles/ebf5e34fc8/10-hours-of-walking-in-nyc-as-a-man',
     'only_matching': True,
@@ -51,19 +54,45 @@ class FunnyOrDieIE(InfoExtractor):
         formats = []

-        formats.extend(self._extract_m3u8_formats(
-            m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
+        m3u8_formats = self._extract_m3u8_formats(
+            m3u8_url, video_id, 'mp4', 'm3u8_native',
+            m3u8_id='hls', fatal=False)
+        source_formats = list(filter(
+            lambda f: f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
+            m3u8_formats))

-        bitrates = [int(bitrate) for bitrate in re.findall(r'[,/]v(\d+)[,/]', m3u8_url)]
+        bitrates = [int(bitrate) for bitrate in re.findall(r'[,/]v(\d+)(?=[,/])', m3u8_url)]
         bitrates.sort()

-        for bitrate in bitrates:
-            for link in links:
-                formats.append({
-                    'url': self._proto_relative_url('%s%d.%s' % (link[0], bitrate, link[1])),
-                    'format_id': '%s-%d' % (link[1], bitrate),
-                    'vbr': bitrate,
-                })
+        if source_formats:
+            self._sort_formats(source_formats)
+
+        for bitrate, f in zip(bitrates, source_formats or [{}] * len(bitrates)):
+            for path, ext in links:
+                ff = f.copy()
+                if ff:
+                    if ext != 'mp4':
+                        ff = dict(
+                            [(k, v) for k, v in ff.items()
+                             if k in ('height', 'width', 'format_id')])
+                    ff.update({
+                        'format_id': ff['format_id'].replace('hls', ext),
+                        'ext': ext,
+                        'protocol': 'http',
+                    })
+                else:
+                    ff.update({
+                        'format_id': '%s-%d' % (ext, bitrate),
+                        'vbr': bitrate,
+                    })
+                ff['url'] = self._proto_relative_url(
+                    '%s%d.%s' % (path, bitrate, ext))
+                formats.append(ff)
+        self._check_formats(formats, video_id)
+        formats.extend(m3u8_formats)
+        self._sort_formats(
+            formats, field_preference=('height', 'width', 'tbr', 'format_id'))

         subtitles = {}
         for src, src_lang in re.findall(r'<track kind="captions" src="([^"]+)" srclang="([^"]+)"', webpage):
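The one-character change to the bitrate regex above (consuming `[,/]` replaced by the lookahead `(?=[,/])`) matters because adjacent variants in CSMIL-style URLs share their delimiters: consuming the trailing comma makes `re.findall` skip every other variant. A quick demonstration on a made-up URL:

```python
import re

# Hypothetical CSMIL-style master URL with three bitrate variants
m3u8_url = 'http://example.com/z/,v110,v200,v400,.mp4.csmil/master.m3u8'

old = re.findall(r'[,/]v(\d+)[,/]', m3u8_url)      # consumes the delimiter
new = re.findall(r'[,/]v(\d+)(?=[,/])', m3u8_url)  # lookahead leaves it in place
print(old)  # ['110', '400'] -- v200 is skipped
print(new)  # ['110', '200', '400']
```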

View File

@@ -29,7 +29,7 @@ class FusionIE(InfoExtractor):
         webpage = self._download_webpage(url, display_id)

         ooyala_code = self._search_regex(
-            r'data-video-id=(["\'])(?P<code>.+?)\1',
+            r'data-ooyala-id=(["\'])(?P<code>(?:(?!\1).)+)\1',
             webpage, 'ooyala code', group='code')

         return OoyalaIE._build_url_result(ooyala_code)

View File

@@ -56,10 +56,10 @@ from .dailymotion import (
 )
 from .onionstudios import OnionStudiosIE
 from .viewlift import ViewLiftEmbedIE
-from .screenwavemedia import ScreenwaveMediaIE
 from .mtv import MTVServicesEmbeddedIE
 from .pladform import PladformIE
 from .videomore import VideomoreIE
+from .webcaster import WebcasterFeedIE
 from .googledrive import GoogleDriveIE
 from .jwplatform import JWPlatformIE
 from .digiteka import DigitekaIE
@@ -1189,16 +1189,6 @@ class GenericIE(InfoExtractor):
                 'duration': 248.667,
             },
         },
-        # ScreenwaveMedia embed
-        {
-            'url': 'http://www.thecinemasnob.com/the-cinema-snob/a-nightmare-on-elm-street-2-freddys-revenge1',
-            'md5': '24ace5baba0d35d55c6810b51f34e9e0',
-            'info_dict': {
-                'id': 'cinemasnob-55d26273809dd',
-                'ext': 'mp4',
-                'title': 'cinemasnob',
-            },
-        },
         # BrightcoveInPageEmbed embed
         {
             'url': 'http://www.geekandsundry.com/tabletop-bonus-wils-final-thoughts-on-dread/',
@@ -2140,6 +2130,11 @@ class GenericIE(InfoExtractor):
         if videomore_url:
             return self.url_result(videomore_url)

+        # Look for Webcaster embeds
+        webcaster_url = WebcasterFeedIE._extract_url(self, webpage)
+        if webcaster_url:
+            return self.url_result(webcaster_url, ie=WebcasterFeedIE.ie_key())
+
         # Look for Playwire embeds
         mobj = re.search(
             r'<script[^>]+data-config=(["\'])(?P<url>(?:https?:)?//config\.playwire\.com/.+?)\1', webpage)
@@ -2206,11 +2201,6 @@ class GenericIE(InfoExtractor):
         if jwplatform_url:
             return self.url_result(jwplatform_url, 'JWPlatform')

-        # Look for ScreenwaveMedia embeds
-        mobj = re.search(ScreenwaveMediaIE.EMBED_PATTERN, webpage)
-        if mobj is not None:
-            return self.url_result(unescapeHTML(mobj.group('url')), 'ScreenwaveMedia')
-
         # Look for Digiteka embeds
         digiteka_url = DigitekaIE._extract_url(webpage)
         if digiteka_url:
@@ -2232,6 +2222,16 @@ class GenericIE(InfoExtractor):
             return self.url_result('limelight:%s:%s' % (
                 lm[mobj.group(1)], mobj.group(2)), 'Limelight%s' % mobj.group(1), mobj.group(2))

+        mobj = re.search(
+            r'''(?sx)
+                <object[^>]+class=(["\'])LimelightEmbeddedPlayerFlash\1[^>]*>.*?
+                    <param[^>]+
+                        name=(["\'])flashVars\2[^>]+
+                        value=(["\'])(?:(?!\3).)*mediaId=(?P<id>[a-z0-9]{32})
+            ''', webpage)
+        if mobj:
+            return self.url_result('limelight:media:%s' % mobj.group('id'))
+
         # Look for AdobeTVVideo embeds
         mobj = re.search(
             r'<iframe[^>]+src=[\'"]((?:https?:)?//video\.tv\.adobe\.com/v/\d+[^"]+)[\'"]',
View File
@@ -6,12 +6,13 @@ from .common import InfoExtractor
 from ..utils import (
     js_to_json,
     remove_end,
+    determine_ext,
 )
 class HellPornoIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?hellporno\.com/videos/(?P<id>[^/]+)'
-    _TEST = {
+    _VALID_URL = r'https?://(?:www\.)?hellporno\.(?:com/videos|net/v)/(?P<id>[^/]+)'
+    _TESTS = [{
         'url': 'http://hellporno.com/videos/dixie-is-posing-with-naked-ass-very-erotic/',
         'md5': '1fee339c610d2049699ef2aa699439f1',
         'info_dict': {
@@ -22,7 +23,10 @@ class HellPornoIE(InfoExtractor):
             'thumbnail': 're:https?://.*\.jpg$',
             'age_limit': 18,
         }
-    }
+    }, {
+        'url': 'http://hellporno.net/v/186271/',
+        'only_matching': True,
+    }]
     def _real_extract(self, url):
         display_id = self._match_id(url)
@@ -38,7 +42,7 @@ class HellPornoIE(InfoExtractor):
         video_id = flashvars.get('video_id')
         thumbnail = flashvars.get('preview_url')
-        ext = flashvars.get('postfix', '.mp4')[1:]
+        ext = determine_ext(flashvars.get('postfix'), 'mp4')
         formats = []
         for video_url_key in ['video_url', 'video_alt_url']:
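The `determine_ext` change above removes a crash path: the old `flashvars.get('postfix', '.mp4')[1:]` slicing raises `TypeError` when `postfix` is present but `None`, and silently returns garbage for non-extension values. A minimal sketch of a `determine_ext`-style helper (a simplified stand-in, not youtube-dl's actual implementation, which does additional validation):

```python
# Simplified sketch of a determine_ext-style helper (assumption: the real
# youtube-dl utility also rejects implausible extensions).
def determine_ext(url, default_ext=None):
    # Guard against None and extension-less values instead of slicing blindly.
    if url is None or '.' not in url:
        return default_ext
    guess = url.rpartition('.')[2]
    # Only accept plausible extensions: short and alphanumeric.
    if guess.isalnum() and len(guess) <= 5:
        return guess
    return default_ext

ext = determine_ext(None, 'mp4')     # falls back to 'mp4' instead of raising
ext2 = determine_ext('.flv', 'mp4')  # extracts 'flv'
```

The fallback value doubles as the default container guess, which is why the diff passes `'mp4'` rather than `'.mp4'`.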
View File
@@ -54,6 +54,22 @@ class LiveLeakIE(InfoExtractor):
             'title': 'Crazy Hungarian tourist films close call waterspout in Croatia',
             'thumbnail': 're:^https?://.*\.jpg$'
         }
+    }, {
+        # Covers https://github.com/rg3/youtube-dl/pull/10664#issuecomment-247439521
+        'url': 'http://m.liveleak.com/view?i=763_1473349649',
+        'add_ie': ['Youtube'],
+        'info_dict': {
+            'id': '763_1473349649',
+            'ext': 'mp4',
+            'title': 'Reporters and public officials ignore epidemic of black on asian violence in Sacramento | Colin Flaherty',
+            'description': 'Colin being the warrior he is and showing the injustice Asians in Sacramento are being subjected to.',
+            'uploader': 'Ziz',
+            'upload_date': '20160908',
+            'uploader_id': 'UCEbta5E_jqlZmEJsriTEtnw'
+        },
+        'params': {
+            'skip_download': True,
+        },
     }]
     @staticmethod
@@ -87,7 +103,7 @@ class LiveLeakIE(InfoExtractor):
         else:
             # Maybe an embed?
             embed_url = self._search_regex(
-                r'<iframe[^>]+src="(http://www.prochan.com/embed\?[^"]+)"',
+                r'<iframe[^>]+src="(https?://(?:www\.)?(?:prochan|youtube)\.com/embed[^"]+)"',
                 webpage, 'embed URL')
             return {
                 '_type': 'url_transparent',
@@ -107,6 +123,7 @@ class LiveLeakIE(InfoExtractor):
             'format_note': s.get('label'),
             'url': s['file'],
         } for i, s in enumerate(sources)]
         for i, s in enumerate(sources):
             # Removing '.h264_*.mp4' gives the raw video, which is essentially
             # the same video without the LiveLeak logo at the top (see
View File
@@ -75,7 +75,7 @@ class MiTeleBaseIE(InfoExtractor):
 class MiTeleIE(InfoExtractor):
     IE_DESC = 'mitele.es'
-    _VALID_URL = r'https?://(?:www\.)?mitele\.es/programas-tv/(?:[^/]+/)(?P<id>[^/]+)/player'
+    _VALID_URL = r'https?://(?:www\.)?mitele\.es/(?:[^/]+/)+(?P<id>[^/]+)/player'
     _TESTS = [{
         'url': 'http://www.mitele.es/programas-tv/diario-de/57b0dfb9c715da65618b4afa/player',
@@ -86,7 +86,10 @@ class MiTeleIE(InfoExtractor):
             'description': 'md5:3b6fce7eaa41b2d97358726378d9369f',
             'series': 'Diario de',
             'season': 'La redacción',
+            'season_number': 14,
+            'season_id': 'diario_de_t14_11981',
             'episode': 'Programa 144',
+            'episode_number': 3,
             'thumbnail': 're:(?i)^https?://.*\.jpg$',
             'duration': 2913,
         },
@@ -101,7 +104,10 @@ class MiTeleIE(InfoExtractor):
             'description': 'md5:5ff132013f0cd968ffbf1f5f3538a65f',
             'series': 'Cuarto Milenio',
             'season': 'Temporada 6',
+            'season_number': 6,
+            'season_id': 'cuarto_milenio_t06_12715',
             'episode': 'Programa 226',
+            'episode_number': 24,
             'thumbnail': 're:(?i)^https?://.*\.jpg$',
             'duration': 7313,
         },
@@ -109,41 +115,77 @@ class MiTeleIE(InfoExtractor):
             'skip_download': True,
         },
         'add_ie': ['Ooyala'],
+    }, {
+        'url': 'http://www.mitele.es/series-online/la-que-se-avecina/57aac5c1c915da951a8b45ed/player',
+        'only_matching': True,
     }]
     def _real_extract(self, url):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
-        gigya_url = self._search_regex(r'<gigya-api>[^>]*</gigya-api>[^>]*<script\s*src="([^"]*)">[^>]*</script>', webpage, 'gigya', default=None)
-        gigya_sc = self._download_webpage(compat_urlparse.urljoin(r'http://www.mitele.es/', gigya_url), video_id, 'Downloading gigya script')
+        gigya_url = self._search_regex(
+            r'<gigya-api>[^>]*</gigya-api>[^>]*<script\s+src="([^"]*)">[^>]*</script>',
+            webpage, 'gigya', default=None)
+        gigya_sc = self._download_webpage(
+            compat_urlparse.urljoin('http://www.mitele.es/', gigya_url),
+            video_id, 'Downloading gigya script')
         # Get a appKey/uuid for getting the session key
-        appKey_var = self._search_regex(r'value\("appGridApplicationKey",([0-9a-f]+)\)', gigya_sc, 'appKey variable')
-        appKey = self._search_regex(r'var %s="([0-9a-f]+)"' % appKey_var, gigya_sc, 'appKey')
-        uid = compat_str(uuid.uuid4())
-        session_url = 'https://appgrid-api.cloud.accedo.tv/session?appKey=%s&uuid=%s' % (appKey, uid)
-        session_json = self._download_json(session_url, video_id, 'Downloading session keys')
-        sessionKey = compat_str(session_json['sessionKey'])
-        paths_url = 'https://appgrid-api.cloud.accedo.tv/metadata/general_configuration,%20web_configuration?sessionKey=' + sessionKey
-        paths = self._download_json(paths_url, video_id, 'Downloading paths JSON')
+        appKey_var = self._search_regex(
+            r'value\s*\(\s*["\']appGridApplicationKey["\']\s*,\s*([0-9a-f]+)',
+            gigya_sc, 'appKey variable')
+        appKey = self._search_regex(
+            r'var\s+%s\s*=\s*["\']([0-9a-f]+)' % appKey_var, gigya_sc, 'appKey')
+        session_json = self._download_json(
+            'https://appgrid-api.cloud.accedo.tv/session',
+            video_id, 'Downloading session keys', query={
+                'appKey': appKey,
+                'uuid': compat_str(uuid.uuid4()),
+            })
+        paths = self._download_json(
+            'https://appgrid-api.cloud.accedo.tv/metadata/general_configuration,%20web_configuration',
+            video_id, 'Downloading paths JSON',
+            query={'sessionKey': compat_str(session_json['sessionKey'])})
         ooyala_s = paths['general_configuration']['api_configuration']['ooyala_search']
-        data_p = (
-            'http://' + ooyala_s['base_url'] + ooyala_s['full_path'] + ooyala_s['provider_id'] +
-            '/docs/' + video_id + '?include_titles=Series,Season&product_name=test&format=full')
-        data = self._download_json(data_p, video_id, 'Downloading data JSON')
-        source = data['hits']['hits'][0]['_source']
+        source = self._download_json(
+            'http://%s%s%s/docs/%s' % (
+                ooyala_s['base_url'], ooyala_s['full_path'],
+                ooyala_s['provider_id'], video_id),
+            video_id, 'Downloading data JSON', query={
+                'include_titles': 'Series,Season',
+                'product_name': 'test',
+                'format': 'full',
+            })['hits']['hits'][0]['_source']
         embedCode = source['offers'][0]['embed_codes'][0]
         titles = source['localizable_titles'][0]
         title = titles.get('title_medium') or titles['title_long']
-        episode = titles['title_sort_name']
-        description = titles['summary_long']
-        titles_series = source['localizable_titles_series'][0]
-        series = titles_series['title_long']
-        titles_season = source['localizable_titles_season'][0]
-        season = titles_season['title_medium']
-        duration = parse_duration(source['videos'][0]['duration'])
+        description = titles.get('summary_long') or titles.get('summary_medium')
+
+        def get(key1, key2):
+            value1 = source.get(key1)
+            if not value1 or not isinstance(value1, list):
+                return
+            if not isinstance(value1[0], dict):
+                return
+            return value1[0].get(key2)
+
+        series = get('localizable_titles_series', 'title_medium')
+        season = get('localizable_titles_season', 'title_medium')
+        season_number = int_or_none(source.get('season_number'))
+        season_id = source.get('season_id')
+        episode = titles.get('title_sort_name')
+        episode_number = int_or_none(source.get('episode_number'))
+        duration = parse_duration(get('videos', 'duration'))
         return {
             '_type': 'url_transparent',
@@ -154,7 +196,10 @@ class MiTeleIE(InfoExtractor):
             'description': description,
             'series': series,
             'season': season,
+            'season_number': season_number,
+            'season_id': season_id,
             'episode': episode,
+            'episode_number': episode_number,
             'duration': duration,
-            'thumbnail': source['images'][0]['url'],
+            'thumbnail': get('images', 'url'),
         }
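The new `get(key1, key2)` helper is what lets the mitele extractor tolerate missing or malformed metadata lists where the old direct indexing (`source['videos'][0]['duration']`) would raise. A standalone sketch of the same logic (taking `source` as a parameter, where the extractor closes over a local variable):

```python
# Standalone version of the defensive two-level lookup introduced in the
# mitele diff above; the extractor itself closes over `source` instead of
# taking it as an argument.
def get_nested(source, key1, key2):
    # Return source[key1][0][key2] only when each level has the expected
    # shape; otherwise return None instead of raising KeyError/TypeError.
    value1 = source.get(key1)
    if not value1 or not isinstance(value1, list):
        return None
    if not isinstance(value1[0], dict):
        return None
    return value1[0].get(key2)

source = {'videos': [{'duration': '00:48:33'}], 'images': 'not-a-list'}
duration = get_nested(source, 'videos', 'duration')  # '00:48:33'
thumbnail = get_nested(source, 'images', 'url')      # None, no crash
```

Paired with `int_or_none` and `titles.get(...)`, this turns every metadata field into a best-effort value, so a schema change on the site degrades output instead of breaking extraction.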
View File
@@ -13,6 +13,7 @@ from ..utils import (
     fix_xml_ampersands,
     float_or_none,
     HEADRequest,
+    NO_DEFAULT,
     RegexNotFoundError,
     sanitized_Request,
     strip_or_none,
@@ -201,7 +202,7 @@ class MTVServicesInfoExtractor(InfoExtractor):
             [self._get_video_info(item) for item in idoc.findall('.//item')],
             playlist_title=title, playlist_description=description)
-    def _extract_mgid(self, webpage):
+    def _extract_mgid(self, webpage, default=NO_DEFAULT):
         try:
             # the url can be http://media.mtvnservices.com/fb/{mgid}.swf
             # or http://media.mtvnservices.com/{mgid}
@@ -221,7 +222,7 @@ class MTVServicesInfoExtractor(InfoExtractor):
         sm4_embed = self._html_search_meta(
             'sm4:video:embed', webpage, 'sm4 embed', default='')
         mgid = self._search_regex(
-            r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid')
+            r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid', default=default)
         return mgid
     def _real_extract(self, url):
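`NO_DEFAULT` is youtube-dl's sentinel for "no default supplied": threading it through `_extract_mgid(webpage, default=NO_DEFAULT)` keeps the raise-on-failure behaviour for existing callers while letting subclasses pass `default=None` and get `None` back. A minimal sketch of the sentinel pattern (hypothetical simplified `extract`, not the real `_search_regex`):

```python
# Minimal sketch of the NO_DEFAULT sentinel pattern used above. A plain
# `default=None` signature cannot distinguish "caller passed None" from
# "caller passed nothing"; a unique object can.
NO_DEFAULT = object()

def extract(found, default=NO_DEFAULT):
    if found is not None:
        return found
    if default is NO_DEFAULT:
        # No default supplied: preserve the old raise-on-failure behaviour.
        raise ValueError('unable to extract mgid')
    return default
```

Callers that can tolerate a miss pass `default=None` and branch on the result, instead of wrapping the call in try/except.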
View File
@@ -2,7 +2,7 @@
 from __future__ import unicode_literals
 from .common import InfoExtractor
-from .screenwavemedia import ScreenwaveMediaIE
+from .jwplatform import JWPlatformIE
 from ..utils import (
     unified_strdate,
@@ -25,7 +25,7 @@ class NormalbootsIE(InfoExtractor):
             # m3u8 download
             'skip_download': True,
         },
-        'add_ie': ['ScreenwaveMedia'],
+        'add_ie': ['JWPlatform'],
     }
     def _real_extract(self, url):
@@ -39,15 +39,13 @@ class NormalbootsIE(InfoExtractor):
             r'<span style="text-transform:uppercase; font-size:inherit;">[A-Za-z]+, (?P<date>.*)</span>',
             webpage, 'date', fatal=False))
-        screenwavemedia_url = self._html_search_regex(
-            ScreenwaveMediaIE.EMBED_PATTERN, webpage, 'screenwave URL',
-            group='url')
+        jwplatform_url = JWPlatformIE._extract_url(webpage)
         return {
             '_type': 'url_transparent',
             'id': video_id,
-            'url': screenwavemedia_url,
-            'ie_key': ScreenwaveMediaIE.ie_key(),
+            'url': jwplatform_url,
+            'ie_key': JWPlatformIE.ie_key(),
             'title': self._og_search_title(webpage),
             'description': self._og_search_description(webpage),
             'thumbnail': self._og_search_thumbnail(webpage),
View File
@@ -1,6 +1,8 @@
 # coding: utf-8
 from __future__ import unicode_literals
+import random
 import re
 from .common import InfoExtractor
@@ -14,6 +15,25 @@ from ..utils import (
 class NRKBaseIE(InfoExtractor):
+    _faked_ip = None
+
+    def _download_webpage_handle(self, *args, **kwargs):
+        # NRK checks X-Forwarded-For HTTP header in order to figure out the
+        # origin of the client behind proxy. This allows to bypass geo
+        # restriction by faking this header's value to some Norway IP.
+        # We will do so once we encounter any geo restriction error.
+        if self._faked_ip:
+            # NB: str is intentional
+            kwargs.setdefault(str('headers'), {})['X-Forwarded-For'] = self._faked_ip
+        return super(NRKBaseIE, self)._download_webpage_handle(*args, **kwargs)
+
+    def _fake_ip(self):
+        # Use fake IP from 37.191.128.0/17 in order to workaround geo
+        # restriction
+        def octet(lb=0, ub=255):
+            return random.randint(lb, ub)
+        self._faked_ip = '37.191.%d.%d' % (octet(128), octet())
+
     def _real_extract(self, url):
         video_id = self._match_id(url)
@@ -24,6 +44,8 @@ class NRKBaseIE(InfoExtractor):
         title = data.get('fullTitle') or data.get('mainTitle') or data['title']
         video_id = data.get('id') or video_id
+        http_headers = {'X-Forwarded-For': self._faked_ip} if self._faked_ip else {}
+
         entries = []
         media_assets = data.get('mediaAssets')
@@ -54,6 +76,7 @@ class NRKBaseIE(InfoExtractor):
                     'duration': duration,
                     'subtitles': subtitles,
                     'formats': formats,
+                    'http_headers': http_headers,
                 })
         if not entries:
@@ -70,10 +93,23 @@ class NRKBaseIE(InfoExtractor):
             }]
         if not entries:
-            if data.get('usageRights', {}).get('isGeoBlocked'):
-                raise ExtractorError(
-                    'NRK har ikke rettigheter til å vise dette programmet utenfor Norge',
-                    expected=True)
+            message_type = data.get('messageType', '')
+            # Can be ProgramIsGeoBlocked or ChannelIsGeoBlocked*
+            if 'IsGeoBlocked' in message_type and not self._faked_ip:
+                self.report_warning(
+                    'Video is geo restricted, trying to fake IP')
+                self._fake_ip()
+                return self._real_extract(url)
+
+            MESSAGES = {
+                'ProgramRightsAreNotReady': 'Du kan dessverre ikke se eller høre programmet',
+                'ProgramRightsHasExpired': 'Programmet har gått ut',
+                'ProgramIsGeoBlocked': 'NRK har ikke rettigheter til å vise dette programmet utenfor Norge',
+            }
+            raise ExtractorError(
+                '%s said: %s' % (self.IE_NAME, MESSAGES.get(
+                    message_type, message_type)),
+                expected=True)
         conviva = data.get('convivaStatistics') or {}
         series = conviva.get('seriesName') or data.get('seriesTitle')
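The geo-bypass added above boils down to advertising a plausible Norwegian client address via the `X-Forwarded-For` header once a geo-block error is seen. A standalone sketch of the address generation (mirroring `_fake_ip` in the diff; the header name and IP range come straight from the patch):

```python
import random

# Sketch of the fake-IP trick above: pick a random host address inside
# 37.191.128.0/17 (a Norwegian range) and send it as X-Forwarded-For.
def fake_norwegian_ip():
    def octet(lb=0, ub=255):
        return random.randint(lb, ub)
    # The third octet starts at 128 so the address stays inside the /17.
    return '37.191.%d.%d' % (octet(128), octet())

headers = {'X-Forwarded-For': fake_norwegian_ip()}
```

Note the retry shape in the patch: the first request goes out clean, and only after a `*IsGeoBlocked` message does the extractor set `_faked_ip` and call `_real_extract` again, so well-behaved requests never carry a spoofed header.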
View File
@@ -1,6 +1,8 @@
 # coding: utf-8
 from __future__ import unicode_literals, division
+import re
+
 from .common import InfoExtractor
 from ..compat import (
     compat_chr,
@@ -10,6 +12,10 @@ from ..utils import (
     determine_ext,
     ExtractorError,
 )
+from ..jsinterp import (
+    JSInterpreter,
+    _NAME_RE
+)
 class OpenloadIE(InfoExtractor):
@@ -56,6 +62,44 @@ class OpenloadIE(InfoExtractor):
         'only_matching': True,
     }]
+    def openload_decode(self, txt):
+        symbol_dict = {
+            '(゚Д゚) [゚Θ゚]': '_',
+            '(゚Д゚) [゚ω゚ノ]': 'a',
+            '(゚Д゚) [゚Θ゚ノ]': 'b',
+            '(゚Д゚) [\'c\']': 'c',
+            '(゚Д゚) [゚ー゚ノ]': 'd',
+            '(゚Д゚) [゚Д゚ノ]': 'e',
+            '(゚Д゚) [1]': 'f',
+            '(゚Д゚) [\'o\']': 'o',
+            '(o゚ー゚o)': 'u',
+            '((゚ー゚) + (o^_^o))': '7',
+            '((o^_^o) +(o^_^o) +(c^_^o))': '6',
+            '((゚ー゚) + (゚Θ゚))': '5',
+            '(-~3)': '4',
+            '(-~-~1)': '3',
+            '(-~1)': '2',
+            '(-~0)': '1',
+            '((c^_^o)-(c^_^o))': '0',
+        }
+        delim = '(゚Д゚)[゚ε゚]+'
+        end_token = '(゚Д゚)[゚o゚]'
+        symbols = '|'.join(map(re.escape, symbol_dict.keys()))
+        txt = re.sub('(%s)\+\s?' % symbols, lambda m: symbol_dict[m.group(1)], txt)
+        ret = ''
+        for aacode in re.findall(r'{0}\+\s?{1}(.*?){0}'.format(re.escape(end_token), re.escape(delim)), txt):
+            for aachar in aacode.split(delim):
+                if aachar.isdigit():
+                    ret += compat_chr(int(aachar, 8))
+                else:
+                    m = re.match(r'^u([\da-f]{4})$', aachar)
+                    if m:
+                        ret += compat_chr(int(m.group(1), 16))
+                    else:
+                        self.report_warning("Cannot decode: %s" % aachar)
+        return ret
+
     def _real_extract(self, url):
         video_id = self._match_id(url)
         webpage = self._download_webpage('https://openload.co/embed/%s/' % video_id, video_id)
@@ -70,19 +114,26 @@ class OpenloadIE(InfoExtractor):
             r'<span[^>]*>([^<]+)</span>\s*<span[^>]*>[^<]+</span>\s*<span[^>]+id="streamurl"',
             webpage, 'encrypted data')
-        magic = compat_ord(enc_data[-1])
+        enc_code = self._html_search_regex(r'<script[^>]+>(゚ω゚[^<]+)</script>',
+                                           webpage, 'encrypted code')
+
+        js_code = self.openload_decode(enc_code)
+        jsi = JSInterpreter(js_code)
+
+        m_offset_fun = self._search_regex(r'slice\(0\s*-\s*(%s)\(\)' % _NAME_RE, js_code, 'javascript offset function')
+        m_diff_fun = self._search_regex(r'charCodeAt\(0\)\s*\+\s*(%s)\(\)' % _NAME_RE, js_code, 'javascript diff function')
+
+        offset = jsi.call_function(m_offset_fun)
+        diff = jsi.call_function(m_diff_fun)
+
         video_url_chars = []
         for idx, c in enumerate(enc_data):
             j = compat_ord(c)
-            if j == magic:
-                j -= 1
-            elif j == magic - 1:
-                j += 1
             if j >= 33 and j <= 126:
                 j = ((j + 14) % 94) + 33
-            if idx == len(enc_data) - 1:
-                j += 3
+            if idx == len(enc_data) - offset:
+                j += diff
             video_url_chars += compat_chr(j)
         video_url = 'https://openload.co/stream/%s?mime=true' % ''.join(video_url_chars)
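The character loop above applies a ROT-47-style substitution: each printable ASCII character (33-126) is rotated by 14 within its 94-character range, and the character at index `len(enc_data) - offset` gets an extra `diff` added, where `offset` and `diff` are now recovered from the page's obfuscated JavaScript instead of being hardcoded. An isolated sketch of just the shift:

```python
def shift_chars(enc_data, offset, diff):
    # Rotate each printable ASCII char by 14 within the 33..126 range and
    # bump the char at index len(enc_data) - offset by diff, mirroring the
    # decoding loop in the openload diff above.
    out = []
    for idx, c in enumerate(enc_data):
        j = ord(c)
        if 33 <= j <= 126:
            j = ((j + 14) % 94) + 33
        if idx == len(enc_data) - offset:
            j += diff
        out.append(chr(j))
    return ''.join(out)

shift_chars('A', 1, 0)  # → 'p' : (65 + 14) % 94 + 33 == 112
```

Moving `offset` and `diff` out of the extractor and into values interpreted from the site's own script is what makes the fix resilient: when the site rotates those constants, the extractor keeps working without a code change.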
View File
@@ -8,30 +8,31 @@ from ..utils import int_or_none
 class PlaysTVIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?plays\.tv/video/(?P<id>[0-9a-f]{18})'
-    _TEST = {
-        'url': 'http://plays.tv/video/56af17f56c95335490/when-you-outplay-the-azir-wall',
+    _VALID_URL = r'https?://(?:www\.)?plays\.tv/(?:video|embeds)/(?P<id>[0-9a-f]{18})'
+    _TESTS = [{
+        'url': 'https://plays.tv/video/56af17f56c95335490/when-you-outplay-the-azir-wall',
         'md5': 'dfeac1198506652b5257a62762cec7bc',
         'info_dict': {
             'id': '56af17f56c95335490',
             'ext': 'mp4',
-            'title': 'When you outplay the Azir wall',
+            'title': 'Bjergsen - When you outplay the Azir wall',
             'description': 'Posted by Bjergsen',
         }
-    }
+    }, {
+        'url': 'https://plays.tv/embeds/56af17f56c95335490',
+        'only_matching': True,
+    }]
     def _real_extract(self, url):
         video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
-
-        title = self._og_search_title(webpage)
-        content = self._parse_json(
-            self._search_regex(
-                r'R\.bindContent\(({.+?})\);', webpage,
-                'content'), video_id)['content']
+        webpage = self._download_webpage(
+            'https://plays.tv/video/%s' % video_id, video_id)
+
+        info = self._search_json_ld(webpage, video_id,)
+
         mpd_url, sources = re.search(
             r'(?s)<video[^>]+data-mpd="([^"]+)"[^>]*>(.+?)</video>',
-            content).groups()
+            webpage).groups()
         formats = self._extract_mpd_formats(
             self._proto_relative_url(mpd_url), video_id, mpd_id='DASH')
         for format_id, height, format_url in re.findall(r'<source\s+res="((\d+)h?)"\s+src="([^"]+)"', sources):
@@ -42,10 +43,11 @@ class PlaysTVIE(InfoExtractor):
             })
         self._sort_formats(formats)
-        return {
+        info.update({
             'id': video_id,
-            'title': title,
             'description': self._og_search_description(webpage),
-            'thumbnail': self._og_search_thumbnail(webpage),
+            'thumbnail': info.get('thumbnail') or self._og_search_thumbnail(webpage),
             'formats': formats,
-        }
+        })
+
+        return info
View File
@@ -10,7 +10,7 @@ from ..utils import (
 class Puls4IE(ProSiebenSat1BaseIE):
-    _VALID_URL = r'https?://(?:www\.)?puls4\.com/(?P<id>(?:[^/]+/)*?videos/[^?#]+)'
+    _VALID_URL = r'https?://(?:www\.)?puls4\.com/(?P<id>[^?#&]+)'
     _TESTS = [{
         'url': 'http://www.puls4.com/2-minuten-2-millionen/staffel-3/videos/2min2miotalk/Tobias-Homberger-von-myclubs-im-2min2miotalk-118118',
         'md5': 'fd3c6b0903ac72c9d004f04bc6bb3e03',
@@ -22,6 +22,12 @@ class Puls4IE(ProSiebenSat1BaseIE):
             'upload_date': '20160830',
             'uploader': 'PULS_4',
         },
+    }, {
+        'url': 'http://www.puls4.com/pro-und-contra/wer-wird-prasident/Ganze-Folgen/Wer-wird-Praesident.-Norbert-Hofer',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.puls4.com/pro-und-contra/wer-wird-prasident/Ganze-Folgen/Wer-wird-Praesident-Analyse-des-Interviews-mit-Norbert-Hofer-416598',
+        'only_matching': True,
     }]
     _TOKEN = 'puls4'
     _SALT = '01!kaNgaiNgah1Ie4AeSha'
View File
@@ -5,6 +5,7 @@ from .common import InfoExtractor
 from ..compat import compat_urllib_parse_urlparse
 from ..utils import (
     determine_ext,
+    ExtractorError,
     int_or_none,
     xpath_attr,
     xpath_text,
@@ -101,6 +102,11 @@ class RuutuIE(InfoExtractor):
             })
         extract_formats(video_xml.find('./Clip'))
+
+        drm = xpath_text(video_xml, './Clip/DRM', default=None)
+        if not formats and drm:
+            raise ExtractorError('This video is DRM protected.', expected=True)
+
         self._sort_formats(formats)
         return {
View File
@@ -1,146 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import (
-    int_or_none,
-    unified_strdate,
-    js_to_json,
-)
-
-
-class ScreenwaveMediaIE(InfoExtractor):
-    _VALID_URL = r'(?:https?:)?//player\d?\.screenwavemedia\.com/(?:play/)?[a-zA-Z]+\.php\?.*\bid=(?P<id>[A-Za-z0-9-]+)'
-    EMBED_PATTERN = r'src=(["\'])(?P<url>(?:https?:)?//player\d?\.screenwavemedia\.com/(?:play/)?[a-zA-Z]+\.php\?.*\bid=.+?)\1'
-    _TESTS = [{
-        'url': 'http://player.screenwavemedia.com/play/play.php?playerdiv=videoarea&companiondiv=squareAd&id=Cinemassacre-19911',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        playerdata = self._download_webpage(
-            'http://player.screenwavemedia.com/player.php?id=%s' % video_id,
-            video_id, 'Downloading player webpage')
-
-        vidtitle = self._search_regex(
-            r'\'vidtitle\'\s*:\s*"([^"]+)"', playerdata, 'vidtitle').replace('\\/', '/')
-
-        playerconfig = self._download_webpage(
-            'http://player.screenwavemedia.com/player.js',
-            video_id, 'Downloading playerconfig webpage')
-
-        videoserver = self._search_regex(r'SWMServer\s*=\s*"([\d\.]+)"', playerdata, 'videoserver')
-
-        sources = self._parse_json(
-            js_to_json(
-                re.sub(
-                    r'(?s)/\*.*?\*/', '',
-                    self._search_regex(
-                        r'sources\s*:\s*(\[[^\]]+?\])', playerconfig,
-                        'sources',
-                    ).replace(
-                        "' + thisObj.options.videoserver + '",
-                        videoserver
-                    ).replace(
-                        "' + playerVidId + '",
-                        video_id
-                    )
-                )
-            ),
-            video_id, fatal=False
-        )
-
-        # Fallback to hardcoded sources if JS changes again
-        if not sources:
-            self.report_warning('Falling back to a hardcoded list of streams')
-            sources = [{
-                'file': 'http://%s/vod/%s_%s.mp4' % (videoserver, video_id, format_id),
-                'type': 'mp4',
-                'label': format_label,
-            } for format_id, format_label in (
-                ('low', '144p Low'), ('med', '160p Med'), ('high', '360p High'), ('hd1', '720p HD1'))]
-            sources.append({
-                'file': 'http://%s/vod/smil:%s.smil/playlist.m3u8' % (videoserver, video_id),
-                'type': 'hls',
-            })
-
-        formats = []
-        for source in sources:
-            file_ = source.get('file')
-            if not file_:
-                continue
-            if source.get('type') == 'hls':
-                formats.extend(self._extract_m3u8_formats(file_, video_id, ext='mp4'))
-            else:
-                format_id = self._search_regex(
-                    r'_(.+?)\.[^.]+$', file_, 'format id', default=None)
-                if not self._is_valid_url(file_, video_id, format_id or 'video'):
-                    continue
-                format_label = source.get('label')
-                height = int_or_none(self._search_regex(
-                    r'^(\d+)[pP]', format_label, 'height', default=None))
-                formats.append({
-                    'url': file_,
-                    'format_id': format_id,
-                    'format': format_label,
-                    'ext': source.get('type'),
-                    'height': height,
-                })
-        self._sort_formats(formats, field_preference=('height', 'width', 'tbr', 'format_id'))
-
-        return {
-            'id': video_id,
-            'title': vidtitle,
-            'formats': formats,
-        }
-
-
-class TeamFourIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?teamfourstar\.com/video/(?P<id>[a-z0-9\-]+)/?'
-    _TEST = {
-        'url': 'http://teamfourstar.com/video/a-moment-with-tfs-episode-4/',
-        'info_dict': {
-            'id': 'TeamFourStar-5292a02f20bfa',
-            'ext': 'mp4',
-            'upload_date': '20130401',
-            'description': 'Check out this and more on our website: http://teamfourstar.com\nTFS Store: http://sharkrobot.com/team-four-star\nFollow on Twitter: http://twitter.com/teamfourstar\nLike on FB: http://facebook.com/teamfourstar',
-            'title': 'A Moment With TFS Episode 4',
-        },
-        'params': {
-            # m3u8 download
-            'skip_download': True,
-        },
-    }
-
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-        webpage = self._download_webpage(url, display_id)
-
-        playerdata_url = self._search_regex(
-            r'src="(http://player\d?\.screenwavemedia\.com/(?:play/)?[a-zA-Z]+\.php\?[^"]*\bid=.+?)"',
-            webpage, 'player data URL')
-
-        video_title = self._html_search_regex(
-            r'<div class="heroheadingtitle">(?P<title>.+?)</div>',
-            webpage, 'title')
-        video_date = unified_strdate(self._html_search_regex(
-            r'<div class="heroheadingdate">(?P<date>.+?)</div>',
-            webpage, 'date', fatal=False))
-        video_description = self._html_search_regex(
-            r'(?s)<div class="postcontent">(?P<description>.+?)</div>',
-            webpage, 'description', fatal=False)
-        video_thumbnail = self._og_search_thumbnail(webpage)
-
-        return {
-            '_type': 'url_transparent',
-            'display_id': display_id,
-            'title': video_title,
-            'description': video_description,
-            'upload_date': video_date,
-            'thumbnail': video_thumbnail,
-            'url': playerdata_url,
-        }
View File
@@ -121,7 +121,7 @@ class SoundcloudIE(InfoExtractor):
         },
     ]
-    _CLIENT_ID = '02gUJC0hH2ct1EGOcYXQIzRFU91c72Ea'
+    _CLIENT_ID = 'fDoItMDbsbZz8dY16ZzARCZmzgHBPotA'
     _IPHONE_CLIENT_ID = '376f225bf427445fc4bfb6b99b72e0bf'
     @staticmethod
View File
@@ -1,5 +1,7 @@
 from __future__ import unicode_literals
+import re
+
 from .mtv import MTVServicesInfoExtractor
@@ -16,6 +18,15 @@ class SpikeIE(MTVServicesInfoExtractor):
             'timestamp': 1388120400,
             'upload_date': '20131227',
         },
+    }, {
+        'url': 'http://www.spike.com/full-episodes/j830qm/lip-sync-battle-joel-mchale-vs-jim-rash-season-2-ep-209',
+        'md5': 'b25c6f16418aefb9ad5a6cae2559321f',
+        'info_dict': {
+            'id': '37ace3a8-1df6-48be-85b8-38df8229e241',
+            'ext': 'mp4',
+            'title': 'Lip Sync Battle|April 28, 2016|2|209|Joel McHale Vs. Jim Rash|Act 1',
+            'description': 'md5:a739ca8f978a7802f67f8016d27ce114',
+        },
     }, {
         'url': 'http://www.spike.com/video-clips/lhtu8m/',
         'only_matching': True,
@@ -32,3 +43,12 @@ class SpikeIE(MTVServicesInfoExtractor):
     _FEED_URL = 'http://www.spike.com/feeds/mrss/'
     _MOBILE_TEMPLATE = 'http://m.spike.com/videos/video.rbml?id=%s'
+    _CUSTOM_URL_REGEX = re.compile(r'spikenetworkapp://([^/]+/[-a-fA-F0-9]+)')
+
+    def _extract_mgid(self, webpage):
+        mgid = super(SpikeIE, self)._extract_mgid(webpage, default=None)
+        if mgid is None:
+            url_parts = self._search_regex(self._CUSTOM_URL_REGEX, webpage, 'episode_id')
+            video_type, episode_id = url_parts.split('/', 1)
+            mgid = 'mgid:arc:{0}:spike.com:{1}'.format(video_type, episode_id)
+        return mgid
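The Spike override above falls back to the page's `spikenetworkapp://` deep link when no mgid is directly embedded, rebuilding the mgid from the link's two path components. A standalone sketch of just that fallback (plain `re.search` standing in for the extractor's `_search_regex`):

```python
import re

# Sketch of the deep-link fallback above: recover an mgid from the
# spikenetworkapp:// URL embedded in the page markup.
CUSTOM_URL_REGEX = re.compile(r'spikenetworkapp://([^/]+/[-a-fA-F0-9]+)')

def mgid_from_deeplink(webpage):
    m = CUSTOM_URL_REGEX.search(webpage)
    if not m:
        return None
    # The link carries "<video_type>/<episode_id>".
    video_type, episode_id = m.group(1).split('/', 1)
    return 'mgid:arc:{0}:spike.com:{1}'.format(video_type, episode_id)

mgid_from_deeplink('... spikenetworkapp://episode/37ace3a8-1df6 ...')
# → 'mgid:arc:episode:spike.com:37ace3a8-1df6'
```

This fallback is only possible because the base class now accepts `default=None` (the `_extract_mgid` change earlier in this commit); with the old raise-on-failure signature there would be nothing to branch on.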
View File
@@ -0,0 +1,48 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from .jwplatform import JWPlatformIE
+from ..utils import unified_strdate
+
+
+class TeamFourStarIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?teamfourstar\.com/(?P<id>[a-z0-9\-]+)'
+    _TEST = {
+        'url': 'http://teamfourstar.com/tfs-abridged-parody-episode-1-2/',
+        'info_dict': {
+            'id': '0WdZO31W',
+            'title': 'TFS Abridged Parody Episode 1',
+            'description': 'md5:d60bc389588ebab2ee7ad432bda953ae',
+            'ext': 'mp4',
+            'timestamp': 1394168400,
+            'upload_date': '20080508',
+        },
+    }
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+
+        jwplatform_url = JWPlatformIE._extract_url(webpage)
+
+        video_title = self._html_search_regex(
+            r'<h1[^>]+class="entry-title"[^>]*>(?P<title>.+?)</h1>',
+            webpage, 'title')
+        video_date = unified_strdate(self._html_search_regex(
+            r'<span[^>]+class="meta-date date updated"[^>]*>(?P<date>.+?)</span>',
+            webpage, 'date', fatal=False))
+        video_description = self._html_search_regex(
+            r'(?s)<div[^>]+class="content-inner"[^>]*>.*?(?P<description><p>.+?)</div>',
+            webpage, 'description', fatal=False)
+        video_thumbnail = self._og_search_thumbnail(webpage)
+
+        return {
+            '_type': 'url_transparent',
+            'display_id': display_id,
+            'title': video_title,
+            'description': video_description,
+            'upload_date': video_date,
+            'thumbnail': video_thumbnail,
+            'url': jwplatform_url,
+        }
View File
@@ -7,33 +7,30 @@ from .common import InfoExtractor

 class TeleBruxellesIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?(?:telebruxelles|bx1)\.be/(news|sport|dernier-jt)/?(?P<id>[^/#?]+)'
+    _VALID_URL = r'https?://(?:www\.)?(?:telebruxelles|bx1)\.be/(news|sport|dernier-jt|emission)/?(?P<id>[^/#?]+)'
     _TESTS = [{
-        'url': 'http://www.telebruxelles.be/news/auditions-devant-parlement-francken-galant-tres-attendus/',
-        'md5': '59439e568c9ee42fb77588b2096b214f',
+        'url': 'http://bx1.be/news/que-risque-lauteur-dune-fausse-alerte-a-la-bombe/',
+        'md5': 'a2a67a5b1c3e8c9d33109b902f474fd9',
         'info_dict': {
-            'id': '11942',
-            'display_id': 'auditions-devant-parlement-francken-galant-tres-attendus',
-            'ext': 'flv',
-            'title': 'Parlement : Francken et Galant répondent aux interpellations de l’opposition',
-            'description': 're:Les auditions des ministres se poursuivent*'
-        },
-        'params': {
-            'skip_download': 'requires rtmpdump'
+            'id': '158856',
+            'display_id': 'que-risque-lauteur-dune-fausse-alerte-a-la-bombe',
+            'ext': 'mp4',
+            'title': 'Que risque l’auteur d’une fausse alerte à la bombe ?',
+            'description': 'md5:3cf8df235d44ebc5426373050840e466',
         },
     }, {
-        'url': 'http://www.telebruxelles.be/sport/basket-brussels-bat-mons-80-74/',
-        'md5': '181d3fbdcf20b909309e5aef5c6c6047',
+        'url': 'http://bx1.be/sport/futsal-schaerbeek-sincline-5-3-a-thulin/',
+        'md5': 'dfe07ecc9c153ceba8582ac912687675',
         'info_dict': {
-            'id': '10091',
-            'display_id': 'basket-brussels-bat-mons-80-74',
-            'ext': 'flv',
-            'title': 'Basket : le Brussels bat Mons 80-74',
-            'description': 're:^Ils l\u2019on fait ! En basket, le B*',
-        },
-        'params': {
-            'skip_download': 'requires rtmpdump'
+            'id': '158433',
+            'display_id': 'futsal-schaerbeek-sincline-5-3-a-thulin',
+            'ext': 'mp4',
+            'title': 'Futsal : Schaerbeek s’incline 5-3 à Thulin',
+            'description': 'md5:fd013f1488d5e2dceb9cebe39e2d569b',
         },
+    }, {
+        'url': 'http://bx1.be/emission/bxenf1-gastronomie/',
+        'only_matching': True,
     }]

     def _real_extract(self, url):

@@ -50,13 +47,13 @@ class TeleBruxellesIE(InfoExtractor):
             r'file\s*:\s*"(rtmp://[^/]+/vod/mp4:"\s*\+\s*"[^"]+"\s*\+\s*".mp4)"',
             webpage, 'RTMP url')
         rtmp_url = re.sub(r'"\s*\+\s*"', '', rtmp_url)
+        formats = self._extract_wowza_formats(rtmp_url, article_id or display_id)
+        self._sort_formats(formats)

         return {
             'id': article_id or display_id,
             'display_id': display_id,
             'title': title,
             'description': description,
-            'url': rtmp_url,
-            'ext': 'flv',
-            'rtmp_live': True  # if rtmpdump is not called with "--live" argument, the download is blocked and can be completed
+            'formats': formats,
         }

View File

@@ -5,10 +5,10 @@ from .common import InfoExtractor

 class ThisOldHouseIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?thisoldhouse\.com/(?:watch|how-to)/(?P<id>[^/?#]+)'
+    _VALID_URL = r'https?://(?:www\.)?thisoldhouse\.com/(?:watch|how-to|tv-episode)/(?P<id>[^/?#]+)'
     _TESTS = [{
         'url': 'https://www.thisoldhouse.com/how-to/how-to-build-storage-bench',
-        'md5': '568acf9ca25a639f0c4ff905826b662f',
+        'md5': '946f05bbaa12a33f9ae35580d2dfcfe3',
         'info_dict': {
             'id': '2REGtUDQ',
             'ext': 'mp4',

@@ -20,6 +20,9 @@ class ThisOldHouseIE(InfoExtractor):
     }, {
         'url': 'https://www.thisoldhouse.com/watch/arlington-arts-crafts-arts-and-crafts-class-begins',
         'only_matching': True,
+    }, {
+        'url': 'https://www.thisoldhouse.com/tv-episode/ask-toh-shelf-rough-electric',
+        'only_matching': True,
     }]

     def _real_extract(self, url):

View File

@@ -56,7 +56,7 @@ class TouTvIE(InfoExtractor):
             'state': state,
         })
         login_form = self._search_regex(
-            r'(?s)(<form[^>]+id="Form-login".+?</form>)', login_webpage, 'login form')
+            r'(?s)(<form[^>]+(?:id|name)="Form-login".+?</form>)', login_webpage, 'login form')
         form_data = self._hidden_inputs(login_form)
         form_data.update({
             'login-email': email,

View File

@@ -0,0 +1,65 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor
from .brightcove import BrightcoveNewIE


class TVANouvellesIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?tvanouvelles\.ca/videos/(?P<id>\d+)'
    _TEST = {
        'url': 'http://www.tvanouvelles.ca/videos/5117035533001',
        'info_dict': {
            'id': '5117035533001',
            'ext': 'mp4',
            'title': 'L’industrie du taxi dénonce l’entente entre Québec et Uber: explications',
            'description': 'md5:479653b7c8cf115747bf5118066bd8b3',
            'uploader_id': '1741764581',
            'timestamp': 1473352030,
            'upload_date': '20160908',
        },
        'add_ie': ['BrightcoveNew'],
    }
    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1741764581/default_default/index.html?videoId=%s'

    def _real_extract(self, url):
        brightcove_id = self._match_id(url)
        return self.url_result(
            self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
            BrightcoveNewIE.ie_key(), brightcove_id)


class TVANouvellesArticleIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?tvanouvelles\.ca/(?:[^/]+/)+(?P<id>[^/?#&]+)'
    _TEST = {
        'url': 'http://www.tvanouvelles.ca/2016/11/17/des-policiers-qui-ont-la-meche-un-peu-courte',
        'info_dict': {
            'id': 'des-policiers-qui-ont-la-meche-un-peu-courte',
            'title': 'Des policiers qui ont «la mèche un peu courte»?',
            'description': 'md5:92d363c8eb0f0f030de9a4a84a90a3a0',
        },
        'playlist_mincount': 4,
    }

    @classmethod
    def suitable(cls, url):
        return False if TVANouvellesIE.suitable(url) else super(TVANouvellesArticleIE, cls).suitable(url)

    def _real_extract(self, url):
        display_id = self._match_id(url)

        webpage = self._download_webpage(url, display_id)

        entries = [
            self.url_result(
                'http://www.tvanouvelles.ca/videos/%s' % mobj.group('id'),
                ie=TVANouvellesIE.ie_key(), video_id=mobj.group('id'))
            for mobj in re.finditer(
                r'data-video-id=(["\'])?(?P<id>\d+)', webpage)]

        title = self._og_search_title(webpage, fatal=False)
        description = self._og_search_description(webpage)

        return self.playlist_result(entries, display_id, title, description)
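The article scan in `TVANouvellesArticleIE` above hinges on the `data-video-id` pattern, where the surrounding quote is optional. A standalone sketch against a made-up HTML snippet (not taken from the site):

```python
import re

# Hypothetical article markup; both quoting styles should match.
html = """
<div class="video" data-video-id="5117035533001"></div>
<div class="video" data-video-id='5117035522001'></div>
"""

# Same pattern as the extractor: (["\'])? makes the quote optional.
ids = [m.group('id')
       for m in re.finditer(r'data-video-id=(["\'])?(?P<id>\d+)', html)]
print(ids)
```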

View File

@@ -25,7 +25,7 @@ class TwitterBaseIE(InfoExtractor):

 class TwitterCardIE(TwitterBaseIE):
     IE_NAME = 'twitter:card'
-    _VALID_URL = r'https?://(?:www\.)?twitter\.com/i/(?:cards/tfw/v1|videos/tweet)/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?twitter\.com/i/(?:cards/tfw/v1|videos(?:/tweet)?)/(?P<id>\d+)'
     _TESTS = [
         {
             'url': 'https://twitter.com/i/cards/tfw/v1/560070183650213889',

@@ -84,6 +84,9 @@ class TwitterCardIE(TwitterBaseIE):
                 'title': 'Twitter web player',
                 'thumbnail': 're:^https?://.*\.jpg',
             },
+        }, {
+            'url': 'https://twitter.com/i/videos/752274308186120192',
+            'only_matching': True,
         },
     ]

View File

@@ -51,7 +51,7 @@ class VevoIE(VevoBaseIE):
             'artist': 'Hurts',
             'genre': 'Pop',
         },
-        'expected_warnings': ['Unable to download SMIL file'],
+        'expected_warnings': ['Unable to download SMIL file', 'Unable to download info'],
     }, {
         'note': 'v3 SMIL format',
         'url': 'http://www.vevo.com/watch/cassadee-pope/i-wish-i-could-break-your-heart/USUV71302923',

@@ -67,7 +67,7 @@ class VevoIE(VevoBaseIE):
             'artist': 'Cassadee Pope',
             'genre': 'Country',
         },
-        'expected_warnings': ['Unable to download SMIL file'],
+        'expected_warnings': ['Unable to download SMIL file', 'Unable to download info'],
     }, {
         'note': 'Age-limited video',
         'url': 'https://www.vevo.com/watch/justin-timberlake/tunnel-vision-explicit/USRV81300282',

@@ -83,7 +83,7 @@ class VevoIE(VevoBaseIE):
             'artist': 'Justin Timberlake',
             'genre': 'Pop',
         },
-        'expected_warnings': ['Unable to download SMIL file'],
+        'expected_warnings': ['Unable to download SMIL file', 'Unable to download info'],
     }, {
         'note': 'No video_info',
         'url': 'http://www.vevo.com/watch/k-camp-1/Till-I-Die/USUV71503000',

@@ -91,15 +91,33 @@ class VevoIE(VevoBaseIE):
         'info_dict': {
             'id': 'USUV71503000',
             'ext': 'mp4',
-            'title': 'K Camp - Till I Die',
+            'title': 'K Camp ft. T.I. - Till I Die',
             'age_limit': 18,
             'timestamp': 1449468000,
             'upload_date': '20151207',
             'uploader': 'K Camp',
             'track': 'Till I Die',
             'artist': 'K Camp',
-            'genre': 'Rap/Hip-Hop',
+            'genre': 'Hip-Hop',
         },
+        'expected_warnings': ['Unable to download SMIL file', 'Unable to download info'],
+    }, {
+        'note': 'Featured test',
+        'url': 'https://www.vevo.com/watch/lemaitre/Wait/USUV71402190',
+        'md5': 'd28675e5e8805035d949dc5cf161071d',
+        'info_dict': {
+            'id': 'USUV71402190',
+            'ext': 'mp4',
+            'title': 'Lemaitre ft. LoLo - Wait',
+            'age_limit': 0,
+            'timestamp': 1413432000,
+            'upload_date': '20141016',
+            'uploader': 'Lemaitre',
+            'track': 'Wait',
+            'artist': 'Lemaitre',
+            'genre': 'Electronic',
+        },
+        'expected_warnings': ['Unable to download SMIL file', 'Unable to download info'],
     }, {
         'note': 'Only available via webpage',
         'url': 'http://www.vevo.com/watch/GBUV71600656',

@@ -242,8 +260,11 @@ class VevoIE(VevoBaseIE):
         timestamp = parse_iso8601(video_info.get('releaseDate'))
         artists = video_info.get('artists')
-        if artists:
-            artist = uploader = artists[0]['name']
+        for curr_artist in artists:
+            if curr_artist.get('role') == 'Featured':
+                featured_artist = curr_artist['name']
+            else:
+                artist = uploader = curr_artist['name']
         view_count = int_or_none(video_info.get('views', {}).get('total'))

         for video_version in video_versions:
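A minimal standalone sketch of the artist loop this hunk introduces, with a made-up artists list shaped like Vevo's API response (the helper name is hypothetical):

```python
def split_artists(artists):
    # Mirrors the loop above: an entry with role 'Featured' is remembered
    # separately; any other entry becomes the main artist/uploader
    # (the last such entry wins).
    artist = uploader = featured_artist = None
    for curr_artist in artists:
        if curr_artist.get('role') == 'Featured':
            featured_artist = curr_artist['name']
        else:
            artist = uploader = curr_artist['name']
    return artist, uploader, featured_artist


# Shaped like the 'Lemaitre ft. LoLo - Wait' test case above:
result = split_artists([
    {'name': 'Lemaitre'},
    {'name': 'LoLo', 'role': 'Featured'},
])
print(result)  # ('Lemaitre', 'Lemaitre', 'LoLo')
```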

View File

@@ -1,11 +1,12 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import json
-import time
-import hmac
 import hashlib
+import hmac
 import itertools
+import json
+import re
+import time

 from .common import InfoExtractor
 from ..utils import (

@@ -276,10 +277,14 @@ class VikiIE(VikiBaseIE):
             height = int_or_none(self._search_regex(
                 r'^(\d+)[pP]$', format_id, 'height', default=None))
             for protocol, format_dict in stream_dict.items():
+                # rtmps URLs does not seem to work
+                if protocol == 'rtmps':
+                    continue
+                format_url = format_dict['url']
                 if format_id == 'm3u8':
                     m3u8_formats = self._extract_m3u8_formats(
-                        format_dict['url'], video_id, 'mp4',
-                        entry_protocol='m3u8_native', preference=-1,
+                        format_url, video_id, 'mp4',
+                        entry_protocol='m3u8_native',
                         m3u8_id='m3u8-%s' % protocol, fatal=False)
                     # Despite CODECS metadata in m3u8 all video-only formats
                     # are actually video+audio

@@ -287,9 +292,23 @@ class VikiIE(VikiBaseIE):
                         if f.get('acodec') == 'none' and f.get('vcodec') != 'none':
                             f['acodec'] = None
                     formats.extend(m3u8_formats)
+                elif format_url.startswith('rtmp'):
+                    mobj = re.search(
+                        r'^(?P<url>rtmp://[^/]+/(?P<app>.+?))/(?P<playpath>mp4:.+)$',
+                        format_url)
+                    if not mobj:
+                        continue
+                    formats.append({
+                        'format_id': 'rtmp-%s' % format_id,
+                        'ext': 'flv',
+                        'url': mobj.group('url'),
+                        'play_path': mobj.group('playpath'),
+                        'app': mobj.group('app'),
+                        'page_url': url,
+                    })
                 else:
                     formats.append({
-                        'url': format_dict['url'],
+                        'url': format_url,
                         'format_id': '%s-%s' % (format_id, protocol),
                         'height': height,
                     })

View File

@@ -17,7 +17,7 @@ from ..compat import compat_urllib_parse_urlencode

 class VLiveIE(InfoExtractor):
     IE_NAME = 'vlive'
     _VALID_URL = r'https?://(?:(?:www|m)\.)?vlive\.tv/video/(?P<id>[0-9]+)'
-    _TEST = {
+    _TESTS = [{
         'url': 'http://www.vlive.tv/video/1326',
         'md5': 'cc7314812855ce56de70a06a27314983',
         'info_dict': {

@@ -27,7 +27,20 @@ class VLiveIE(InfoExtractor):
             'creator': "Girl's Day",
             'view_count': int,
         },
-    }
+    }, {
+        'url': 'http://www.vlive.tv/video/16937',
+        'info_dict': {
+            'id': '16937',
+            'ext': 'mp4',
+            'title': '[V LIVE] 첸백시 걍방',
+            'creator': 'EXO',
+            'view_count': int,
+            'subtitles': 'mincount:12',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }]

     def _real_extract(self, url):
         video_id = self._match_id(url)

@@ -116,7 +129,7 @@ class VLiveIE(InfoExtractor):
         subtitles = {}
         for caption in playinfo.get('captions', {}).get('list', []):
-            lang = dict_get(caption, ('language', 'locale', 'country', 'label'))
+            lang = dict_get(caption, ('locale', 'language', 'country', 'label'))
             if lang and caption.get('source'):
                 subtitles[lang] = [{
                     'ext': 'vtt',
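The `dict_get` reorder above changes which caption key wins when several are present. A rough stand-in for the helper (the real `youtube_dl.utils.dict_get` also takes `default` and `skip_false_values` parameters) shows the effect:

```python
def dict_get(d, keys):
    # Simplified stand-in: the first key with a truthy value wins.
    for key in keys:
        value = d.get(key)
        if value:
            return value
    return None


# Hypothetical caption entry carrying both keys:
caption = {'language': 'en', 'locale': 'en_US'}
old = dict_get(caption, ('language', 'locale', 'country', 'label'))
new = dict_get(caption, ('locale', 'language', 'country', 'label'))
print(old, new)  # en en_US
```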

View File

@@ -0,0 +1,102 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor
from ..utils import (
    determine_ext,
    xpath_text,
)


class WebcasterIE(InfoExtractor):
    _VALID_URL = r'https?://bl\.webcaster\.pro/(?:quote|media)/start/free_(?P<id>[^/]+)'
    _TESTS = [{
        # http://video.khl.ru/quotes/393859
        'url': 'http://bl.webcaster.pro/quote/start/free_c8cefd240aa593681c8d068cff59f407_hd/q393859/eb173f99dd5f558674dae55f4ba6806d/1480289104?sr%3D105%26fa%3D1%26type_id%3D18',
        'md5': '0c162f67443f30916ff1c89425dcd4cd',
        'info_dict': {
            'id': 'c8cefd240aa593681c8d068cff59f407_hd',
            'ext': 'mp4',
            'title': 'Сибирь - Нефтехимик. Лучшие моменты первого периода',
            'thumbnail': 're:^https?://.*\.jpg$',
        },
    }, {
        'url': 'http://bl.webcaster.pro/media/start/free_6246c7a4453ac4c42b4398f840d13100_hd/2_2991109016/e8d0d82587ef435480118f9f9c41db41/4635726126',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)

        video = self._download_xml(url, video_id)

        title = xpath_text(video, './/event_name', 'event name', fatal=True)

        def make_id(parts, separator):
            return separator.join(filter(None, parts))

        formats = []
        for format_id in (None, 'noise'):
            track_tag = make_id(('track', format_id), '_')
            for track in video.findall('.//iphone/%s' % track_tag):
                track_url = track.text
                if not track_url:
                    continue
                if determine_ext(track_url) == 'm3u8':
                    m3u8_formats = self._extract_m3u8_formats(
                        track_url, video_id, 'mp4',
                        entry_protocol='m3u8_native',
                        m3u8_id=make_id(('hls', format_id), '-'), fatal=False)
                    for f in m3u8_formats:
                        f.update({
                            'source_preference': 0 if format_id == 'noise' else 1,
                            'format_note': track.get('title'),
                        })
                    formats.extend(m3u8_formats)
        self._sort_formats(formats)

        thumbnail = xpath_text(video, './/image', 'thumbnail')

        return {
            'id': video_id,
            'title': title,
            'thumbnail': thumbnail,
            'formats': formats,
        }


class WebcasterFeedIE(InfoExtractor):
    _VALID_URL = r'https?://bl\.webcaster\.pro/feed/start/free_(?P<id>[^/]+)'
    _TEST = {
        'url': 'http://bl.webcaster.pro/feed/start/free_c8cefd240aa593681c8d068cff59f407_hd/q393859/eb173f99dd5f558674dae55f4ba6806d/1480289104',
        'only_matching': True,
    }

    @staticmethod
    def _extract_url(ie, webpage):
        mobj = re.search(
            r'<(?:object|a[^>]+class=["\']webcaster-player["\'])[^>]+data(?:-config)?=(["\']).*?config=(?P<url>https?://bl\.webcaster\.pro/feed/start/free_.*?)(?:[?&]|\1)',
            webpage)
        if mobj:
            return mobj.group('url')

        for secure in (True, False):
            video_url = ie._og_search_video_url(
                webpage, secure=secure, default=None)
            if video_url:
                mobj = re.search(
                    r'config=(?P<url>https?://bl\.webcaster\.pro/feed/start/free_[^?&=]+)',
                    video_url)
                if mobj:
                    return mobj.group('url')

    def _real_extract(self, url):
        video_id = self._match_id(url)

        feed = self._download_xml(url, video_id)

        video_url = xpath_text(
            feed, ('video_hd', 'video'), 'video url', fatal=True)

        return self.url_result(video_url, WebcasterIE.ie_key())
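The `make_id` helper in `WebcasterIE` builds track tags and format ids while silently dropping the `None` placeholder used for the default track. A standalone copy of the helper, exercised with the same inputs the extractor passes:

```python
def make_id(parts, separator):
    # As in the extractor above: join only the non-empty parts.
    return separator.join(filter(None, parts))


print(make_id(('track', None), '_'))     # track
print(make_id(('track', 'noise'), '_'))  # track_noise
print(make_id(('hls', None), '-'))       # hls
print(make_id(('hls', 'noise'), '-'))    # hls-noise
```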

View File

@@ -1796,7 +1796,7 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
                         |
                         ((?:PL|LL|EC|UU|FL|RD|UL)[0-9A-Za-z-_]{10,})
                      )"""
-    _TEMPLATE_URL = 'https://www.youtube.com/playlist?list=%s'
+    _TEMPLATE_URL = 'https://www.youtube.com/playlist?list=%s&disable_polymer=true'
     _VIDEO_RE = r'href="\s*/watch\?v=(?P<id>[0-9A-Za-z_-]{11})&amp;[^"]*?index=(?P<index>\d+)(?:[^>]+>(?P<title>[^<]+))?'
     IE_NAME = 'youtube:playlist'
     _TESTS = [{

@@ -2175,7 +2175,7 @@ class YoutubeUserIE(YoutubeChannelIE):

 class YoutubeLiveIE(YoutubeBaseInfoExtractor):
     IE_DESC = 'YouTube.com live streams'
-    _VALID_URL = r'(?P<base_url>https?://(?:\w+\.)?youtube\.com/(?:user|channel|c)/(?P<id>[^/]+))/live'
+    _VALID_URL = r'(?P<base_url>https?://(?:\w+\.)?youtube\.com/(?:(?:user|channel|c)/)?(?P<id>[^/]+))/live'
     IE_NAME = 'youtube:live'
     _TESTS = [{

@@ -2204,6 +2204,9 @@ class YoutubeLiveIE(YoutubeBaseInfoExtractor):
     }, {
         'url': 'https://www.youtube.com/c/CommanderVideoHq/live',
         'only_matching': True,
+    }, {
+        'url': 'https://www.youtube.com/TheYoungTurks/live',
+        'only_matching': True,
     }]

     def _real_extract(self, url):

View File

@@ -198,12 +198,12 @@ class JSInterpreter(object):
             return opfunc(x, y)

         m = re.match(
-            r'^(?P<func>%s)\((?P<args>[a-zA-Z0-9_$,]+)\)$' % _NAME_RE, expr)
+            r'^(?P<func>%s)\((?P<args>[a-zA-Z0-9_$,]*)\)$' % _NAME_RE, expr)
         if m:
             fname = m.group('func')
             argvals = tuple([
                 int(v) if v.isdigit() else local_vars[v]
-                for v in m.group('args').split(',')])
+                for v in m.group('args').split(',')]) if len(m.group('args')) > 0 else tuple()
             if fname not in self._functions:
                 self._functions[fname] = self.extract_function(fname)
             return self._functions[fname](argvals)
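The `+` → `*` change lets the args group match an empty string, so zero-argument calls like `bar()` now parse. A standalone sketch of the matching logic; `_NAME_RE` is reproduced here by assumption and the empty-args guard mirrors the new ternary:

```python
import re

# Assumed JS-identifier pattern standing in for jsinterp's _NAME_RE.
_NAME_RE = r'[a-zA-Z_$][a-zA-Z_$0-9]*'


def parse_call(expr):
    # '*' (new) instead of '+' (old) allows the args group to be empty.
    m = re.match(r'^(?P<func>%s)\((?P<args>[a-zA-Z0-9_$,]*)\)$' % _NAME_RE, expr)
    if not m:
        return None
    args = m.group('args')
    # Splitting '' would yield ('',), so empty args map to an empty tuple.
    argvals = tuple(args.split(',')) if args else tuple()
    return m.group('func'), argvals


print(parse_call('foo(a,b)'))  # ('foo', ('a', 'b'))
print(parse_call('bar()'))     # ('bar', ())
```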

View File

@@ -55,12 +55,12 @@ class Socks5AddressType(object):
     ATYP_IPV6 = 0x04


-class ProxyError(IOError):
+class ProxyError(socket.error):
     ERR_SUCCESS = 0x00

     def __init__(self, code=None, msg=None):
         if code is not None and msg is None:
-            msg = self.CODES.get(code) and 'unknown error'
+            msg = self.CODES.get(code) or 'unknown error'
         super(ProxyError, self).__init__(code, msg)

@@ -103,6 +103,7 @@ class ProxyType(object):
     SOCKS4A = 1
     SOCKS5 = 2
+
 Proxy = collections.namedtuple('Proxy', (
     'type', 'host', 'port', 'username', 'password', 'remote_dns'))

@@ -122,7 +123,7 @@ class sockssocket(socket.socket):
         while len(data) < cnt:
             cur = self.recv(cnt - len(data))
             if not cur:
-                raise IOError('{0} bytes missing'.format(cnt - len(data)))
+                raise EOFError('{0} bytes missing'.format(cnt - len(data)))
             data += cur
         return data
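The `and` → `or` fix in `ProxyError.__init__` above corrects the fallback message: the old expression returned `'unknown error'` exactly when the code *was* known and `None` when it was not. A toy `CODES` table (the real one lives on the `ProxyError` subclasses) demonstrates both behaviors:

```python
# Hypothetical reply-code table standing in for the per-class CODES dicts.
CODES = {0x01: 'general SOCKS server failure'}

code = 0x5f  # a code not in the table

# Old expression: a truthy lookup short-circuits to 'unknown error',
# while a missing code yields None -- inverted from the intent.
old_msg = CODES.get(code) and 'unknown error'
# New expression: the looked-up message wins; 'unknown error' is the fallback.
new_msg = CODES.get(code) or 'unknown error'

print(old_msg, new_msg)  # None unknown error

known_msg = CODES.get(0x01) or 'unknown error'
print(known_msg)  # general SOCKS server failure
```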

View File

@@ -115,6 +115,8 @@ def _u30(reader):
     res = _read_int(reader)
     assert res & 0xf0000000 == 0
     return res
+
+
 _u32 = _read_int

@@ -176,6 +178,7 @@ class _Undefined(object):
         return 'undefined'

     __repr__ = __str__
+
 undefined = _Undefined()

View File

@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2016.11.08.1'
+__version__ = '2016.12.01'