commit 0106c186ff

--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@
 ---
 
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.01.07*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.01.14*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.01.07**
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.01.14**
 
 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2018.01.07
+[debug] youtube-dl version 2018.01.14
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/AUTHORS
+++ b/AUTHORS
@@ -232,3 +232,4 @@ Tatsuyuki Ishi
 Daniel Weber
 Kay Bouché
 Yang Hongbo
+Lei Wang
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,8 +1,22 @@
-version <unreleased>
+version 2018.01.14
 
 Extractors
+* [youtube] Fix live streams extraction (#15202)
+* [wdr] Bypass geo restriction
+* [wdr] Rework extractors (#14598)
++ [wdr] Add support for wdrmaus.de/elefantenseite (#14598)
++ [gamestar] Add support for gamepro.de (#3384)
+* [viafree] Skip rtmp formats (#15232)
++ [pandoratv] Add support for mobile URLs (#12441)
++ [pandoratv] Add support for new URL format (#15131)
++ [ximalaya] Add support for ximalaya.com (#14687)
++ [digg] Add support for digg.com (#15214)
+* [limelight] Tolerate empty pc formats (#15150, #15151, #15207)
+* [ndr:embed:base] Make separate formats extraction non fatal (#15203)
 + [weibo] Add extractor (#15079)
-* [bilibili] fix extraction (#15188)
++ [ok] Add support for live streams
+* [canalplus] Fix extraction (#15072)
+* [bilibili] Fix extraction (#15188)
 
 
 version 2018.01.07
--- a/README.md
+++ b/README.md
@@ -46,7 +46,7 @@ Or with [MacPorts](https://www.macports.org/):
 Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
 
 # DESCRIPTION
-**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on Mac OS X. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
+**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on macOS. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
 
     youtube-dl [OPTIONS] URL [URL...]
 
@@ -878,7 +878,7 @@ Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
 
 In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [Export Cookies](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) (for Firefox).
 
-Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, Mac OS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
+Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, macOS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
 
 Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).
 
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -128,7 +128,7 @@
 - **CamdemyFolder**
 - **CamWithHer**
 - **canalc2.tv**
-- **Canalplus**: canalplus.fr, piwiplus.fr and d8.tv
+- **Canalplus**: mycanal.fr and piwiplus.fr
 - **Canvas**
 - **CanvasEen**: canvas.be and een.be
 - **CarambaTV**
@@ -210,6 +210,7 @@
 - **defense.gouv.fr**
 - **democracynow**
 - **DHM**: Filmarchiv - Deutsches Historisches Museum
+- **Digg**
 - **DigitallySpeaking**
 - **Digiteka**
 - **Discovery**
@@ -773,7 +774,6 @@
 - **Sport5**
 - **SportBoxEmbed**
 - **SportDeutschland**
-- **Sportschau**
 - **Sprout**
 - **sr:mediathek**: Saarländischer Rundfunk
 - **SRGSSR**
@@ -1002,10 +1002,14 @@
 - **WatchIndianPorn**: Watch Indian Porn
 - **WDR**
 - **wdr:mobile**
+- **WDRElefant**
+- **WDRPage**
 - **Webcaster**
 - **WebcasterFeed**
 - **WebOfStories**
 - **WebOfStoriesPlaylist**
+- **Weibo**
+- **WeiboMobile**
 - **WeiqiTV**: WQTV
 - **wholecloud**: WholeCloud
 - **Wimp**
@@ -1025,6 +1029,8 @@
 - **xiami:artist**: 虾米音乐 - 歌手
 - **xiami:collection**: 虾米音乐 - 精选集
 - **xiami:song**: 虾米音乐
+- **ximalaya**: 喜马拉雅FM
+- **ximalaya:album**: 喜马拉雅FM 专辑
 - **XMinus**
 - **XNXX**
 - **Xstream**
--- /dev/null
+++ b/youtube_dl/extractor/digg.py
@@ -0,0 +1,56 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import js_to_json
+
+
+class DiggIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?digg\.com/video/(?P<id>[^/?#&]+)'
+    _TESTS = [{
+        # JWPlatform via provider
+        'url': 'http://digg.com/video/sci-fi-short-jonah-daniel-kaluuya-get-out',
+        'info_dict': {
+            'id': 'LcqvmS0b',
+            'ext': 'mp4',
+            'title': "'Get Out' Star Daniel Kaluuya Goes On 'Moby Dick'-Like Journey In Sci-Fi Short 'Jonah'",
+            'description': 'md5:541bb847648b6ee3d6514bc84b82efda',
+            'upload_date': '20180109',
+            'timestamp': 1515530551,
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # Youtube via provider
+        'url': 'http://digg.com/video/dog-boat-seal-play',
+        'only_matching': True,
+    }, {
+        # vimeo as regular embed
+        'url': 'http://digg.com/video/dream-girl-short-film',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        info = self._parse_json(
+            self._search_regex(
+                r'(?s)video_info\s*=\s*({.+?});\n', webpage, 'video info',
+                default='{}'), display_id, transform_source=js_to_json,
+            fatal=False)
+
+        video_id = info.get('video_id')
+
+        if video_id:
+            provider = info.get('provider_name')
+            if provider == 'youtube':
+                return self.url_result(
+                    video_id, ie='Youtube', video_id=video_id)
+            elif provider == 'jwplayer':
+                return self.url_result(
+                    'jwplatform:%s' % video_id, ie='JWPlatform',
+                    video_id=video_id)
+
+        return self.url_result(url, 'Generic')
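The heart of the new extractor is a dispatch on the page-embedded `video_info` object. Below is a minimal standalone sketch of that logic outside youtube-dl's `InfoExtractor` machinery; the sample page snippet is invented, and plain JSON stands in for the `js_to_json` preprocessing the real extractor applies:

```python
import json
import re

# Same pattern as DiggIE._VALID_URL
VALID_URL = r'https?://(?:www\.)?digg\.com/video/(?P<id>[^/?#&]+)'

def resolve(webpage):
    # Pull the page-embedded video_info object; the real extractor runs
    # the match through js_to_json before parsing.
    m = re.search(r'(?s)video_info\s*=\s*({.+?});\n', webpage)
    info = json.loads(m.group(1)) if m else {}
    video_id = info.get('video_id')
    if video_id:
        provider = info.get('provider_name')
        if provider == 'youtube':
            return ('Youtube', video_id)
        elif provider == 'jwplayer':
            return ('JWPlatform', 'jwplatform:%s' % video_id)
    # anything else (e.g. a plain Vimeo embed) falls through to Generic
    return ('Generic', None)

page = 'var video_info = {"video_id": "LcqvmS0b", "provider_name": "jwplayer"};\n'
print(resolve(page))  # ('JWPlatform', 'jwplatform:LcqvmS0b')
print(re.match(VALID_URL, 'http://digg.com/video/dog-boat-seal-play').group('id'))
```

Unrecognized providers deliberately fall through to the generic extractor rather than failing, which is what keeps the plain-embed test case working.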
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -259,6 +259,7 @@ from .deezer import DeezerPlaylistIE
 from .democracynow import DemocracynowIE
 from .dfb import DFBIE
 from .dhm import DHMIE
+from .digg import DiggIE
 from .dotsub import DotsubIE
 from .douyutv import (
     DouyuShowIE,
@@ -990,7 +991,6 @@ from .stitcher import StitcherIE
 from .sport5 import Sport5IE
 from .sportbox import SportBoxEmbedIE
 from .sportdeutschland import SportDeutschlandIE
-from .sportschau import SportschauIE
 from .sprout import SproutIE
 from .srgssr import (
     SRGSSRIE,
@@ -1288,6 +1288,8 @@ from .watchbox import WatchBoxIE
 from .watchindianporn import WatchIndianPornIE
 from .wdr import (
     WDRIE,
+    WDRPageIE,
+    WDRElefantIE,
     WDRMobileIE,
 )
 from .webcaster import (
@@ -1327,6 +1329,10 @@ from .xiami import (
     XiamiArtistIE,
     XiamiCollectionIE
 )
+from .ximalaya import (
+    XimalayaIE,
+    XimalayaAlbumIE
+)
 from .xminus import XMinusIE
 from .xnxx import XNXXIE
 from .xstream import XstreamIE
--- a/youtube_dl/extractor/gamestar.py
+++ b/youtube_dl/extractor/gamestar.py
@@ -1,6 +1,8 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
+import re
+
 from .common import InfoExtractor
 from ..utils import (
     int_or_none,
@@ -9,27 +11,34 @@ from ..utils import (
 
 
 class GameStarIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?gamestar\.de/videos/.*,(?P<id>[0-9]+)\.html'
+    _VALID_URL = r'https?://(?:www\.)?game(?P<site>pro|star)\.de/videos/.*,(?P<id>[0-9]+)\.html'
-    _TEST = {
+    _TESTS = [{
         'url': 'http://www.gamestar.de/videos/trailer,3/hobbit-3-die-schlacht-der-fuenf-heere,76110.html',
-        'md5': '96974ecbb7fd8d0d20fca5a00810cea7',
+        'md5': 'ee782f1f8050448c95c5cacd63bc851c',
         'info_dict': {
             'id': '76110',
             'ext': 'mp4',
             'title': 'Hobbit 3: Die Schlacht der Fünf Heere - Teaser-Trailer zum dritten Teil',
             'description': 'Der Teaser-Trailer zu Hobbit 3: Die Schlacht der Fünf Heere zeigt einige Szenen aus dem dritten Teil der Saga und kündigt den...',
             'thumbnail': r're:^https?://.*\.jpg$',
-            'timestamp': 1406542020,
+            'timestamp': 1406542380,
             'upload_date': '20140728',
-            'duration': 17
-        }
-    }
+            'duration': 17,
+        }
+    }, {
+        'url': 'http://www.gamepro.de/videos/top-10-indie-spiele-fuer-nintendo-switch-video-tolle-nindies-games-zum-download,95316.html',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.gamestar.de/videos/top-10-indie-spiele-fuer-nintendo-switch-video-tolle-nindies-games-zum-download,95316.html',
+        'only_matching': True,
+    }]
 
     def _real_extract(self, url):
-        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
+        mobj = re.match(self._VALID_URL, url)
+        site = mobj.group('site')
+        video_id = mobj.group('id')
+
+        webpage = self._download_webpage(url, video_id)
 
-        url = 'http://gamestar.de/_misc/videos/portal/getVideoUrl.cfm?premium=0&videoId=' + video_id
         # TODO: there are multiple ld+json objects in the webpage,
         # while _search_json_ld finds only the first one
@@ -37,16 +46,17 @@ class GameStarIE(InfoExtractor):
             r'(?s)<script[^>]+type=(["\'])application/ld\+json\1[^>]*>(?P<json_ld>[^<]+VideoObject[^<]+)</script>',
             webpage, 'JSON-LD', group='json_ld'), video_id)
         info_dict = self._json_ld(json_ld, video_id)
-        info_dict['title'] = remove_end(info_dict['title'], ' - GameStar')
+        info_dict['title'] = remove_end(
+            info_dict['title'], ' - Game%s' % site.title())
 
-        view_count = json_ld.get('interactionCount')
+        view_count = int_or_none(json_ld.get('interactionCount'))
         comment_count = int_or_none(self._html_search_regex(
-            r'([0-9]+) Kommentare</span>', webpage, 'comment_count',
-            fatal=False))
+            r'<span>Kommentare</span>\s*<span[^>]+class=["\']count[^>]+>\s*\(\s*([0-9]+)',
+            webpage, 'comment count', fatal=False))
 
         info_dict.update({
             'id': video_id,
-            'url': url,
+            'url': 'http://gamestar.de/_misc/videos/portal/getVideoUrl.cfm?premium=0&videoId=' + video_id,
             'ext': 'mp4',
             'view_count': view_count,
             'comment_count': comment_count
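The widened pattern does double duty: the new `site` group both admits gamepro.de URLs and tells the extractor which suffix to strip from the JSON-LD title. A small sketch of that interplay (example URL and title are invented, `remove_end` is re-implemented to match `youtube_dl.utils.remove_end`):

```python
import re

# Same pattern as the updated GameStarIE._VALID_URL
VALID_URL = r'https?://(?:www\.)?game(?P<site>pro|star)\.de/videos/.*,(?P<id>[0-9]+)\.html'

def remove_end(s, end):
    # mirrors youtube_dl.utils.remove_end
    return s[:-len(end)] if end and s.endswith(end) else s

def title_for(url, raw_title):
    mobj = re.match(VALID_URL, url)
    site = mobj.group('site')  # 'star' or 'pro'
    # ' - GameStar' on gamestar.de, ' - GamePro' on gamepro.de
    return mobj.group('id'), remove_end(raw_title, ' - Game%s' % site.title())

print(title_for(
    'http://www.gamepro.de/videos/top-10,95316.html',
    'Top 10 Indies - GamePro'))  # ('95316', 'Top 10 Indies')
```

`str.title()` turning `'pro'` into `'Pro'` is what lets one expression cover both site brandings.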
--- a/youtube_dl/extractor/limelight.py
+++ b/youtube_dl/extractor/limelight.py
@@ -10,6 +10,7 @@ from ..utils import (
     float_or_none,
     int_or_none,
     smuggle_url,
+    try_get,
     unsmuggle_url,
     ExtractorError,
 )
@@ -220,6 +221,12 @@ class LimelightBaseIE(InfoExtractor):
             'subtitles': subtitles,
         }
 
+    def _extract_info_helper(self, pc, mobile, i, metadata):
+        return self._extract_info(
+            try_get(pc, lambda x: x['playlistItems'][i]['streams'], list) or [],
+            try_get(mobile, lambda x: x['mediaList'][i]['mobileUrls'], list) or [],
+            metadata)
+
 
 class LimelightMediaIE(LimelightBaseIE):
     IE_NAME = 'limelight'
@@ -282,10 +289,7 @@ class LimelightMediaIE(LimelightBaseIE):
             'getMobilePlaylistByMediaId', 'properties',
             smuggled_data.get('source_url'))
 
-        return self._extract_info(
-            pc['playlistItems'][0].get('streams', []),
-            mobile['mediaList'][0].get('mobileUrls', []) if mobile else [],
-            metadata)
+        return self._extract_info_helper(pc, mobile, 0, metadata)
 
 
 class LimelightChannelIE(LimelightBaseIE):
@@ -326,10 +330,7 @@ class LimelightChannelIE(LimelightBaseIE):
             'media', smuggled_data.get('source_url'))
 
         entries = [
-            self._extract_info(
-                pc['playlistItems'][i].get('streams', []),
-                mobile['mediaList'][i].get('mobileUrls', []) if mobile else [],
-                medias['media_list'][i])
+            self._extract_info_helper(pc, mobile, i, medias['media_list'][i])
             for i in range(len(medias['media_list']))]
 
         return self.playlist_result(entries, channel_id, pc['title'])
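The point of the new helper is swapping direct indexing for `try_get`, so an empty or malformed pc playlist (#15150) yields an empty stream list instead of a `KeyError`. A minimal re-implementation of `youtube_dl.utils.try_get` shows the effect (`streams_for` is a hypothetical wrapper, not part of the extractor):

```python
# Minimal re-implementation of youtube_dl.utils.try_get: apply a getter,
# swallow lookup errors, and optionally type-check the result.
def try_get(src, getter, expected_type=None):
    try:
        v = getter(src)
    except (AttributeError, KeyError, TypeError, IndexError):
        return None
    if expected_type is None or isinstance(v, expected_type):
        return v

def streams_for(pc, i):
    # The old code did pc['playlistItems'][i].get('streams', []) and
    # crashed when 'playlistItems' was absent or shorter than i + 1.
    return try_get(pc, lambda x: x['playlistItems'][i]['streams'], list) or []

print(streams_for({}, 0))                                       # []
print(streams_for({'playlistItems': [{'streams': ['a']}]}, 0))  # ['a']
```

The `expected_type=list` check also guards against a non-list `streams` value, which the `or []` fallback then normalizes.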
--- a/youtube_dl/extractor/pandoratv.py
+++ b/youtube_dl/extractor/pandoratv.py
@@ -1,6 +1,8 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
+import re
+
 from .common import InfoExtractor
 from ..compat import (
     compat_str,
@@ -18,7 +20,14 @@ from ..utils import (
 class PandoraTVIE(InfoExtractor):
     IE_NAME = 'pandora.tv'
     IE_DESC = '판도라TV'
-    _VALID_URL = r'https?://(?:.+?\.)?channel\.pandora\.tv/channel/video\.ptv\?'
+    _VALID_URL = r'''(?x)
+                        https?://
+                            (?:
+                                (?:www\.)?pandora\.tv/view/(?P<user_id>[^/]+)/(?P<id>\d+)|  # new format
+                                (?:.+?\.)?channel\.pandora\.tv/channel/video\.ptv\?|  # old format
+                                m\.pandora\.tv/?\?  # mobile
+                            )
+                    '''
     _TESTS = [{
         'url': 'http://jp.channel.pandora.tv/channel/video.ptv?c1=&prgid=53294230&ch_userid=mikakim&ref=main&lot=cate_01_2',
         'info_dict': {
@@ -53,9 +62,20 @@ class PandoraTVIE(InfoExtractor):
             # Test metadata only
             'skip_download': True,
         },
+    }, {
+        'url': 'http://www.pandora.tv/view/mikakim/53294230#36797454_new',
+        'only_matching': True,
+    }, {
+        'url': 'http://m.pandora.tv/?c=view&ch_userid=mikakim&prgid=54600346',
+        'only_matching': True,
     }]
 
     def _real_extract(self, url):
-        qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
-        video_id = qs.get('prgid', [None])[0]
-        user_id = qs.get('ch_userid', [None])[0]
+        mobj = re.match(self._VALID_URL, url)
+        user_id = mobj.group('user_id')
+        video_id = mobj.group('id')
+
+        if not user_id or not video_id:
+            qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
+            video_id = qs.get('prgid', [None])[0]
+            user_id = qs.get('ch_userid', [None])[0]
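The extended pattern only defines `user_id`/`id` groups on the new-format branch, which is exactly why `_real_extract` now falls back to query-string parsing when those groups come back empty. A sketch of the three URL shapes against the pattern (URLs taken from the test cases above):

```python
import re

# Same alternation as the new PandoraTVIE._VALID_URL: named groups exist
# only in the first (new-format) branch.
VALID_URL = r'''(?x)
    https?://
        (?:
            (?:www\.)?pandora\.tv/view/(?P<user_id>[^/]+)/(?P<id>\d+)|  # new format
            (?:.+?\.)?channel\.pandora\.tv/channel/video\.ptv\?|        # old format
            m\.pandora\.tv/?\?                                          # mobile
        )
'''

for url in (
    'http://www.pandora.tv/view/mikakim/53294230#36797454_new',
    'http://jp.channel.pandora.tv/channel/video.ptv?prgid=53294230&ch_userid=mikakim',
    'http://m.pandora.tv/?c=view&ch_userid=mikakim&prgid=54600346',
):
    mobj = re.match(VALID_URL, url)
    # old/mobile URLs match but leave both groups as None, triggering
    # the query-string fallback in the extractor
    print(mobj.group('user_id'), mobj.group('id'))
```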
--- a/youtube_dl/extractor/sportschau.py
+++ /dev/null
@@ -1,38 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-from .wdr import WDRBaseIE
-from ..utils import get_element_by_attribute
-
-
-class SportschauIE(WDRBaseIE):
-    IE_NAME = 'Sportschau'
-    _VALID_URL = r'https?://(?:www\.)?sportschau\.de/(?:[^/]+/)+video-?(?P<id>[^/#?]+)\.html'
-    _TEST = {
-        'url': 'http://www.sportschau.de/uefaeuro2016/videos/video-dfb-team-geht-gut-gelaunt-ins-spiel-gegen-polen-100.html',
-        'info_dict': {
-            'id': 'mdb-1140188',
-            'display_id': 'dfb-team-geht-gut-gelaunt-ins-spiel-gegen-polen-100',
-            'ext': 'mp4',
-            'title': 'DFB-Team geht gut gelaunt ins Spiel gegen Polen',
-            'description': 'Vor dem zweiten Gruppenspiel gegen Polen herrscht gute Stimmung im deutschen Team. Insbesondere Bastian Schweinsteiger strotzt vor Optimismus nach seinem Tor gegen die Ukraine.',
-            'upload_date': '20160615',
-        },
-        'skip': 'Geo-restricted to Germany',
-    }
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, video_id)
-        title = get_element_by_attribute('class', 'headline', webpage)
-        description = self._html_search_meta('description', webpage, 'description')
-
-        info = self._extract_wdr_video(webpage, video_id)
-
-        info.update({
-            'title': title,
-            'description': description,
-        })
-
-        return info
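The standalone Sportschau extractor can go because the reworked `WDRPageIE` pattern in this commit's wdr.py changes adds `sportschau` to its host alternation. A quick demonstration, with the pattern pieces reassembled as in wdr.py:

```python
import re

# Reassembled from the reworked WDRPageIE in this commit: the host
# alternation now includes sportschau.de alongside wdr.de.
CURRENT_MAUS_URL = r'https?://(?:www\.)wdrmaus.de/(?:[^/]+/){1,2}[^/?#]+\.php5'
PAGE_REGEX = r'/(?:mediathek/)?(?:[^/]+/)*(?P<display_id>[^/]+)\.html'
VALID_URL = (r'https?://(?:www\d?\.)?(?:wdr\d?|sportschau)\.de' + PAGE_REGEX
             + '|' + CURRENT_MAUS_URL)

for url in (
    'http://www.sportschau.de/uefaeuro2016/videos/video-dfb-team-geht-gut-gelaunt-ins-spiel-gegen-polen-100.html',
    'http://www1.wdr.de/mediathek/video/live/index.html',
):
    # both a sportschau.de page and a classic WDR mediathek page match
    print(re.match(VALID_URL, url).group('display_id'))
```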
--- a/youtube_dl/extractor/tvplay.py
+++ b/youtube_dl/extractor/tvplay.py
@@ -273,6 +273,8 @@ class TVPlayIE(InfoExtractor):
                 'ext': ext,
             }
             if video_url.startswith('rtmp'):
+                if smuggled_data.get('skip_rtmp'):
+                    continue
                 m = re.search(
                     r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<playpath>.+)$', video_url)
                 if not m:
@@ -434,6 +436,10 @@ class ViafreeIE(InfoExtractor):
         return self.url_result(
             smuggle_url(
                 'mtg:%s' % video_id,
-                {'geo_countries': [
-                    compat_urlparse.urlparse(url).netloc.rsplit('.', 1)[-1]]}),
+                {
+                    'geo_countries': [
+                        compat_urlparse.urlparse(url).netloc.rsplit('.', 1)[-1]],
+                    # rtmp host mtgfs.fplive.net for viafree is unresolvable
+                    'skip_rtmp': True,
+                }),
             ie=TVPlayIE.ie_key(), video_id=video_id)
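`ViafreeIE` hands the `skip_rtmp` flag to `TVPlayIE` via `smuggle_url`, which piggybacks JSON data on the URL fragment. A simplified sketch of that mechanism (the real `youtube_dl.utils` pair also merges already-smuggled data, omitted here):

```python
import json
from urllib.parse import parse_qs, urlencode

# Simplified versions of youtube_dl.utils.smuggle_url/unsmuggle_url:
# extra data rides along in a '#__youtubedl_smuggle=...' fragment.
def smuggle_url(url, data):
    return url + '#' + urlencode({'__youtubedl_smuggle': json.dumps(data)})

def unsmuggle_url(smug_url, default=None):
    if '#__youtubedl_smuggle' not in smug_url:
        return smug_url, default
    url, _, frag = smug_url.partition('#')
    data = json.loads(parse_qs(frag)['__youtubedl_smuggle'][0])
    return url, data

u = smuggle_url('mtg:123', {'geo_countries': ['se'], 'skip_rtmp': True})
url, smuggled = unsmuggle_url(u, {})
print(url, smuggled['skip_rtmp'])  # mtg:123 True
```

Since the fragment never reaches the server, this is a clean way to pass extractor-to-extractor hints without a second network round trip.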
--- a/youtube_dl/extractor/wdr.py
+++ b/youtube_dl/extractor/wdr.py
@@ -4,49 +4,50 @@ from __future__ import unicode_literals
 import re
 
 from .common import InfoExtractor
+from ..compat import (
+    compat_str,
+    compat_urlparse,
+)
 from ..utils import (
     determine_ext,
     ExtractorError,
     js_to_json,
     strip_jsonp,
+    try_get,
     unified_strdate,
     update_url_query,
     urlhandle_detect_ext,
 )
 
 
-class WDRBaseIE(InfoExtractor):
-    def _extract_wdr_video(self, webpage, display_id):
-        # for wdr.de the data-extension is in a tag with the class "mediaLink"
-        # for wdr.de radio players, in a tag with the class "wdrrPlayerPlayBtn"
-        # for wdrmaus, in a tag with the class "videoButton" (previously a link
-        # to the page in a multiline "videoLink"-tag)
-        json_metadata = self._html_search_regex(
-            r'''(?sx)class=
-                    (?:
-                        (["\'])(?:mediaLink|wdrrPlayerPlayBtn|videoButton)\b.*?\1[^>]+|
-                        (["\'])videoLink\b.*?\2[\s]*>\n[^\n]*
-                    )data-extension=(["\'])(?P<data>(?:(?!\3).)+)\3
-            ''',
-            webpage, 'media link', default=None, group='data')
-
-        if not json_metadata:
-            return
-
-        media_link_obj = self._parse_json(json_metadata, display_id,
-                                          transform_source=js_to_json)
-        jsonp_url = media_link_obj['mediaObj']['url']
-
+class WDRIE(InfoExtractor):
+    _VALID_URL = r'https?://deviceids-medp\.wdr\.de/ondemand/\d+/(?P<id>\d+)\.js'
+    _GEO_COUNTRIES = ['DE']
+    _TEST = {
+        'url': 'http://deviceids-medp.wdr.de/ondemand/155/1557833.js',
+        'info_dict': {
+            'id': 'mdb-1557833',
+            'ext': 'mp4',
+            'title': 'Biathlon-Staffel verpasst Podest bei Olympia-Generalprobe',
+            'upload_date': '20180112',
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
         metadata = self._download_json(
-            jsonp_url, display_id, transform_source=strip_jsonp)
+            url, video_id, transform_source=strip_jsonp)
 
-        metadata_tracker_data = metadata['trackerData']
-        metadata_media_resource = metadata['mediaResource']
+        is_live = metadata.get('mediaType') == 'live'
+
+        tracker_data = metadata['trackerData']
+        media_resource = metadata['mediaResource']
 
         formats = []
 
         # check if the metadata contains a direct URL to a file
-        for kind, media_resource in metadata_media_resource.items():
+        for kind, media_resource in media_resource.items():
             if kind not in ('dflt', 'alt'):
                 continue
 
@@ -57,13 +58,13 @@ class WDRBaseIE(InfoExtractor):
                 ext = determine_ext(medium_url)
                 if ext == 'm3u8':
                     formats.extend(self._extract_m3u8_formats(
-                        medium_url, display_id, 'mp4', 'm3u8_native',
+                        medium_url, video_id, 'mp4', 'm3u8_native',
                         m3u8_id='hls'))
                 elif ext == 'f4m':
                     manifest_url = update_url_query(
                         medium_url, {'hdcore': '3.2.0', 'plugin': 'aasp-3.2.0.77.18'})
                     formats.extend(self._extract_f4m_formats(
-                        manifest_url, display_id, f4m_id='hds', fatal=False))
+                        manifest_url, video_id, f4m_id='hds', fatal=False))
                 elif ext == 'smil':
                     formats.extend(self._extract_smil_formats(
                         medium_url, 'stream', fatal=False))
@@ -73,7 +74,7 @@ class WDRBaseIE(InfoExtractor):
                     }
                     if ext == 'unknown_video':
                         urlh = self._request_webpage(
-                            medium_url, display_id, note='Determining extension')
+                            medium_url, video_id, note='Determining extension')
                         ext = urlhandle_detect_ext(urlh)
                         a_format['ext'] = ext
                     formats.append(a_format)
@@ -81,30 +82,30 @@ class WDRBaseIE(InfoExtractor):
         self._sort_formats(formats)
 
         subtitles = {}
-        caption_url = metadata_media_resource.get('captionURL')
+        caption_url = media_resource.get('captionURL')
         if caption_url:
             subtitles['de'] = [{
                 'url': caption_url,
                 'ext': 'ttml',
             }]
 
-        title = metadata_tracker_data['trackerClipTitle']
+        title = tracker_data['trackerClipTitle']
 
         return {
-            'id': metadata_tracker_data.get('trackerClipId', display_id),
-            'display_id': display_id,
-            'title': title,
-            'alt_title': metadata_tracker_data.get('trackerClipSubcategory'),
+            'id': tracker_data.get('trackerClipId', video_id),
+            'title': self._live_title(title) if is_live else title,
+            'alt_title': tracker_data.get('trackerClipSubcategory'),
             'formats': formats,
             'subtitles': subtitles,
-            'upload_date': unified_strdate(metadata_tracker_data.get('trackerClipAirTime')),
+            'upload_date': unified_strdate(tracker_data.get('trackerClipAirTime')),
+            'is_live': is_live,
         }
 
 
-class WDRIE(WDRBaseIE):
+class WDRPageIE(InfoExtractor):
     _CURRENT_MAUS_URL = r'https?://(?:www\.)wdrmaus.de/(?:[^/]+/){1,2}[^/?#]+\.php5'
-    _PAGE_REGEX = r'/(?:mediathek/)?[^/]+/(?P<type>[^/]+)/(?P<display_id>.+)\.html'
-    _VALID_URL = r'(?P<page_url>https?://(?:www\d\.)?wdr\d?\.de)' + _PAGE_REGEX + '|' + _CURRENT_MAUS_URL
+    _PAGE_REGEX = r'/(?:mediathek/)?(?:[^/]+/)*(?P<display_id>[^/]+)\.html'
+    _VALID_URL = r'https?://(?:www\d?\.)?(?:wdr\d?|sportschau)\.de' + _PAGE_REGEX + '|' + _CURRENT_MAUS_URL
 
     _TESTS = [
         {
@@ -124,6 +125,7 @@ class WDRIE(WDRBaseIE):
                     'ext': 'ttml',
                 }]},
             },
+            'skip': 'HTTP Error 404: Not Found',
         },
         {
             'url': 'http://www1.wdr.de/mediathek/audio/wdr3/wdr3-gespraech-am-samstag/audio-schriftstellerin-juli-zeh-100.html',
@@ -139,19 +141,17 @@ class WDRIE(WDRBaseIE):
                 'is_live': False,
                 'subtitles': {}
             },
+            'skip': 'HTTP Error 404: Not Found',
         },
         {
             'url': 'http://www1.wdr.de/mediathek/video/live/index.html',
             'info_dict': {
-                'id': 'mdb-103364',
+                'id': 'mdb-1406149',
                 'ext': 'mp4',
-                'display_id': 'index',
-                'title': r're:^WDR Fernsehen im Livestream [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
+                'title': r're:^WDR Fernsehen im Livestream \(nur in Deutschland erreichbar\) [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
|
|
||||||
'alt_title': 'WDR Fernsehen Live',
|
'alt_title': 'WDR Fernsehen Live',
|
||||||
'upload_date': None,
|
'upload_date': '20150101',
|
||||||
'description': 'md5:ae2ff888510623bf8d4b115f95a9b7c9',
|
|
||||||
'is_live': True,
|
'is_live': True,
|
||||||
'subtitles': {}
|
|
||||||
},
|
},
|
||||||
'params': {
|
'params': {
|
||||||
'skip_download': True, # m3u8 download
|
'skip_download': True, # m3u8 download
|
||||||
@ -159,19 +159,18 @@ class WDRIE(WDRBaseIE):
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
'url': 'http://www1.wdr.de/mediathek/video/sendungen/aktuelle-stunde/aktuelle-stunde-120.html',
|
'url': 'http://www1.wdr.de/mediathek/video/sendungen/aktuelle-stunde/aktuelle-stunde-120.html',
|
||||||
'playlist_mincount': 8,
|
'playlist_mincount': 7,
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'aktuelle-stunde/aktuelle-stunde-120',
|
'id': 'aktuelle-stunde-120',
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
'url': 'http://www.wdrmaus.de/aktuelle-sendung/index.php5',
|
'url': 'http://www.wdrmaus.de/aktuelle-sendung/index.php5',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'mdb-1323501',
|
'id': 'mdb-1552552',
|
||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'upload_date': 're:^[0-9]{8}$',
|
'upload_date': 're:^[0-9]{8}$',
|
||||||
'title': 're:^Die Sendung mit der Maus vom [0-9.]{10}$',
|
'title': 're:^Die Sendung mit der Maus vom [0-9.]{10}$',
|
||||||
'description': 'Die Seite mit der Maus -',
|
|
||||||
},
|
},
|
||||||
'skip': 'The id changes from week to week because of the new episode'
|
'skip': 'The id changes from week to week because of the new episode'
|
||||||
},
|
},
|
||||||
@ -183,7 +182,6 @@ class WDRIE(WDRBaseIE):
|
|||||||
'ext': 'mp4',
|
'ext': 'mp4',
|
||||||
'upload_date': '20130919',
|
'upload_date': '20130919',
|
||||||
'title': 'Sachgeschichte - Achterbahn ',
|
'title': 'Sachgeschichte - Achterbahn ',
|
||||||
'description': 'Die Seite mit der Maus -',
|
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@ -191,52 +189,114 @@ class WDRIE(WDRBaseIE):
|
|||||||
# Live stream, MD5 unstable
|
# Live stream, MD5 unstable
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
'id': 'mdb-869971',
|
'id': 'mdb-869971',
|
||||||
'ext': 'flv',
|
'ext': 'mp4',
|
||||||
'title': 'COSMO Livestream',
|
'title': r're:^COSMO Livestream [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
|
||||||
'description': 'md5:2309992a6716c347891c045be50992e4',
|
|
||||||
'upload_date': '20160101',
|
'upload_date': '20160101',
|
||||||
},
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True, # m3u8 download
|
||||||
|
}
|
||||||
|
},
|
||||||
|
{
|
||||||
|
'url': 'http://www.sportschau.de/handballem2018/handball-nationalmannschaft-em-stolperstein-vorrunde-100.html',
|
||||||
|
'info_dict': {
|
||||||
|
'id': 'mdb-1556012',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'DHB-Vizepräsident Bob Hanning - "Die Weltspitze ist extrem breit"',
|
||||||
|
'upload_date': '20180111',
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
'url': 'http://www.sportschau.de/handballem2018/audio-vorschau---die-handball-em-startet-mit-grossem-favoritenfeld-100.html',
|
||||||
|
'only_matching': True,
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
|
|
||||||
def _real_extract(self, url):
|
def _real_extract(self, url):
|
||||||
mobj = re.match(self._VALID_URL, url)
|
mobj = re.match(self._VALID_URL, url)
|
||||||
url_type = mobj.group('type')
|
|
||||||
page_url = mobj.group('page_url')
|
|
||||||
display_id = mobj.group('display_id')
|
display_id = mobj.group('display_id')
|
||||||
webpage = self._download_webpage(url, display_id)
|
webpage = self._download_webpage(url, display_id)
|
||||||
|
|
||||||
info_dict = self._extract_wdr_video(webpage, display_id)
|
entries = []
|
||||||
|
|
||||||
if not info_dict:
|
# Article with several videos
|
||||||
|
|
||||||
|
# for wdr.de the data-extension is in a tag with the class "mediaLink"
|
||||||
|
# for wdr.de radio players, in a tag with the class "wdrrPlayerPlayBtn"
|
||||||
|
# for wdrmaus, in a tag with the class "videoButton" (previously a link
|
||||||
|
# to the page in a multiline "videoLink"-tag)
|
||||||
|
for mobj in re.finditer(
|
||||||
|
r'''(?sx)class=
|
||||||
|
(?:
|
||||||
|
(["\'])(?:mediaLink|wdrrPlayerPlayBtn|videoButton)\b.*?\1[^>]+|
|
||||||
|
(["\'])videoLink\b.*?\2[\s]*>\n[^\n]*
|
||||||
|
)data-extension=(["\'])(?P<data>(?:(?!\3).)+)\3
|
||||||
|
''', webpage):
|
||||||
|
media_link_obj = self._parse_json(
|
||||||
|
mobj.group('data'), display_id, transform_source=js_to_json,
|
||||||
|
fatal=False)
|
||||||
|
if not media_link_obj:
|
||||||
|
continue
|
||||||
|
jsonp_url = try_get(
|
||||||
|
media_link_obj, lambda x: x['mediaObj']['url'], compat_str)
|
||||||
|
if jsonp_url:
|
||||||
|
entries.append(self.url_result(jsonp_url, ie=WDRIE.ie_key()))
|
||||||
|
|
||||||
|
# Playlist (e.g. https://www1.wdr.de/mediathek/video/sendungen/aktuelle-stunde/aktuelle-stunde-120.html)
|
||||||
|
if not entries:
|
||||||
entries = [
|
entries = [
|
||||||
self.url_result(page_url + href[0], 'WDR')
|
self.url_result(
|
||||||
for href in re.findall(
|
compat_urlparse.urljoin(url, mobj.group('href')),
|
||||||
r'<a href="(%s)"[^>]+data-extension=' % self._PAGE_REGEX,
|
ie=WDRPageIE.ie_key())
|
||||||
webpage)
|
for mobj in re.finditer(
|
||||||
|
r'<a[^>]+\bhref=(["\'])(?P<href>(?:(?!\1).)+)\1[^>]+\bdata-extension=',
|
||||||
|
webpage) if re.match(self._PAGE_REGEX, mobj.group('href'))
|
||||||
]
|
]
|
||||||
|
|
||||||
if entries: # Playlist page
|
|
||||||
return self.playlist_result(entries, playlist_id=display_id)
|
return self.playlist_result(entries, playlist_id=display_id)
|
||||||
|
|
||||||
raise ExtractorError('No downloadable streams found', expected=True)
|
|
||||||
|
|
||||||
is_live = url_type == 'live'
|
class WDRElefantIE(InfoExtractor):
|
||||||
|
_VALID_URL = r'https?://(?:www\.)wdrmaus\.de/elefantenseite/#(?P<id>.+)'
|
||||||
|
_TEST = {
|
||||||
|
'url': 'http://www.wdrmaus.de/elefantenseite/#folge_ostern_2015',
|
||||||
|
'info_dict': {
|
||||||
|
'title': 'Folge Oster-Spezial 2015',
|
||||||
|
'id': 'mdb-1088195',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'age_limit': None,
|
||||||
|
'upload_date': '20150406'
|
||||||
|
},
|
||||||
|
'params': {
|
||||||
|
'skip_download': True,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
if is_live:
|
def _real_extract(self, url):
|
||||||
info_dict.update({
|
display_id = self._match_id(url)
|
||||||
'title': self._live_title(info_dict['title']),
|
|
||||||
'upload_date': None,
|
|
||||||
})
|
|
||||||
elif 'upload_date' not in info_dict:
|
|
||||||
info_dict['upload_date'] = unified_strdate(self._html_search_meta('DC.Date', webpage, 'upload date'))
|
|
||||||
|
|
||||||
info_dict.update({
|
# Table of Contents seems to always be at this address, so fetch it directly.
|
||||||
'description': self._html_search_meta('Description', webpage),
|
# The website fetches configurationJS.php5, which links to tableOfContentsJS.php5.
|
||||||
'is_live': is_live,
|
table_of_contents = self._download_json(
|
||||||
})
|
'https://www.wdrmaus.de/elefantenseite/data/tableOfContentsJS.php5',
|
||||||
|
display_id)
|
||||||
return info_dict
|
if display_id not in table_of_contents:
|
||||||
|
raise ExtractorError(
|
||||||
|
'No entry in site\'s table of contents for this URL. '
|
||||||
|
'Is the fragment part of the URL (after the #) correct?',
|
||||||
|
expected=True)
|
||||||
|
xml_metadata_path = table_of_contents[display_id]['xmlPath']
|
||||||
|
xml_metadata = self._download_xml(
|
||||||
|
'https://www.wdrmaus.de/elefantenseite/' + xml_metadata_path,
|
||||||
|
display_id)
|
||||||
|
zmdb_url_element = xml_metadata.find('./movie/zmdb_url')
|
||||||
|
if zmdb_url_element is None:
|
||||||
|
raise ExtractorError(
|
||||||
|
'%s is not a video' % display_id, expected=True)
|
||||||
|
return self.url_result(zmdb_url_element.text, ie=WDRIE.ie_key())
|
||||||
|
|
||||||
|
|
||||||
class WDRMobileIE(InfoExtractor):
|
class WDRMobileIE(InfoExtractor):
|
||||||
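The rewritten `WDRPageIE._real_extract` above finds embedded players by matching `data-extension` attributes with a verbose regex and parsing the captured attribute value as a JS object whose `mediaObj.url` points at the metadata JSONP. A minimal standalone sketch of that matching step, outside youtube-dl: the sample HTML and the `example.com` URL are invented for illustration, and plain `json.loads` stands in for youtube-dl's `_parse_json` with `js_to_json`.

```python
import json
import re

# Invented sample markup shaped like a WDR article page; real pages embed a
# JavaScript object literal, which youtube-dl first normalizes via js_to_json().
SAMPLE_HTML = (
    '<a class="mediaLink" '
    'data-extension=\'{"mediaObj": {"url": "http://example.com/ondemand/148/1489727.js"}}\'>'
    'Video</a>'
)

# Same verbose regex as in the diff above: three player classes, plus the
# older multiline videoLink form, followed by the data-extension attribute.
DATA_EXTENSION_RE = r'''(?sx)class=
    (?:
        (["\'])(?:mediaLink|wdrrPlayerPlayBtn|videoButton)\b.*?\1[^>]+|
        (["\'])videoLink\b.*?\2[\s]*>\n[^\n]*
    )data-extension=(["\'])(?P<data>(?:(?!\3).)+)\3
'''

urls = []
for mobj in re.finditer(DATA_EXTENSION_RE, SAMPLE_HTML):
    media_link_obj = json.loads(mobj.group('data'))  # stand-in for _parse_json + js_to_json
    urls.append(media_link_obj['mediaObj']['url'])

print(urls)
```

The back-reference groups (`\1`, `\2`, `\3`) let the attribute value use either quote style without the regex tripping over the other quote character inside it.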
youtube_dl/extractor/ximalaya.py (new file, 233 lines)
@@ -0,0 +1,233 @@
+# coding: utf-8
+
+from __future__ import unicode_literals
+
+import itertools
+import re
+
+from .common import InfoExtractor
+
+
+class XimalayaBaseIE(InfoExtractor):
+    _GEO_COUNTRIES = ['CN']
+
+
+class XimalayaIE(XimalayaBaseIE):
+    IE_NAME = 'ximalaya'
+    IE_DESC = '喜马拉雅FM'
+    _VALID_URL = r'https?://(?:www\.|m\.)?ximalaya\.com/(?P<uid>[0-9]+)/sound/(?P<id>[0-9]+)'
+    _USER_URL_FORMAT = '%s://www.ximalaya.com/zhubo/%i/'
+    _TESTS = [
+        {
+            'url': 'http://www.ximalaya.com/61425525/sound/47740352/',
+            'info_dict': {
+                'id': '47740352',
+                'ext': 'm4a',
+                'uploader': '小彬彬爱听书',
+                'uploader_id': 61425525,
+                'uploader_url': 'http://www.ximalaya.com/zhubo/61425525/',
+                'title': '261.唐诗三百首.卷八.送孟浩然之广陵.李白',
+                'description': "contains:《送孟浩然之广陵》\n作者:李白\n故人西辞黄鹤楼,烟花三月下扬州。\n孤帆远影碧空尽,惟见长江天际流。",
+                'thumbnails': [
+                    {
+                        'name': 'cover_url',
+                        'url': r're:^https?://.*\.jpg$',
+                    },
+                    {
+                        'name': 'cover_url_142',
+                        'url': r're:^https?://.*\.jpg$',
+                        'width': 180,
+                        'height': 180
+                    }
+                ],
+                'categories': ['renwen', '人文'],
+                'duration': 93,
+                'view_count': int,
+                'like_count': int,
+            }
+        },
+        {
+            'url': 'http://m.ximalaya.com/61425525/sound/47740352/',
+            'info_dict': {
+                'id': '47740352',
+                'ext': 'm4a',
+                'uploader': '小彬彬爱听书',
+                'uploader_id': 61425525,
+                'uploader_url': 'http://www.ximalaya.com/zhubo/61425525/',
+                'title': '261.唐诗三百首.卷八.送孟浩然之广陵.李白',
+                'description': "contains:《送孟浩然之广陵》\n作者:李白\n故人西辞黄鹤楼,烟花三月下扬州。\n孤帆远影碧空尽,惟见长江天际流。",
+                'thumbnails': [
+                    {
+                        'name': 'cover_url',
+                        'url': r're:^https?://.*\.jpg$',
+                    },
+                    {
+                        'name': 'cover_url_142',
+                        'url': r're:^https?://.*\.jpg$',
+                        'width': 180,
+                        'height': 180
+                    }
+                ],
+                'categories': ['renwen', '人文'],
+                'duration': 93,
+                'view_count': int,
+                'like_count': int,
+            }
+        },
+        {
+            'url': 'https://www.ximalaya.com/11045267/sound/15705996/',
+            'info_dict': {
+                'id': '15705996',
+                'ext': 'm4a',
+                'uploader': '李延隆老师',
+                'uploader_id': 11045267,
+                'uploader_url': 'https://www.ximalaya.com/zhubo/11045267/',
+                'title': 'Lesson 1 Excuse me!',
+                'description': "contains:Listen to the tape then answer\xa0this question. Whose handbag is it?\n"
+                               "听录音,然后回答问题,这是谁的手袋?",
+                'thumbnails': [
+                    {
+                        'name': 'cover_url',
+                        'url': r're:^https?://.*\.jpg$',
+                    },
+                    {
+                        'name': 'cover_url_142',
+                        'url': r're:^https?://.*\.jpg$',
+                        'width': 180,
+                        'height': 180
+                    }
+                ],
+                'categories': ['train', '外语'],
+                'duration': 40,
+                'view_count': int,
+                'like_count': int,
+            }
+        },
+    ]
+
+    def _real_extract(self, url):
+
+        is_m = 'm.ximalaya' in url
+        scheme = 'https' if url.startswith('https') else 'http'
+
+        audio_id = self._match_id(url)
+        webpage = self._download_webpage(url, audio_id,
+                                         note='Download sound page for %s' % audio_id,
+                                         errnote='Unable to get sound page')
+
+        audio_info_file = '%s://m.ximalaya.com/tracks/%s.json' % (scheme, audio_id)
+        audio_info = self._download_json(audio_info_file, audio_id,
+                                         'Downloading info json %s' % audio_info_file,
+                                         'Unable to download info file')
+
+        formats = []
+        for bps, k in (('24k', 'play_path_32'), ('64k', 'play_path_64')):
+            if audio_info.get(k):
+                formats.append({
+                    'format_id': bps,
+                    'url': audio_info[k],
+                })
+
+        thumbnails = []
+        for k in audio_info.keys():
+            # cover pics kyes like: cover_url', 'cover_url_142'
+            if k.startswith('cover_url'):
+                thumbnail = {'name': k, 'url': audio_info[k]}
+                if k == 'cover_url_142':
+                    thumbnail['width'] = 180
+                    thumbnail['height'] = 180
+                thumbnails.append(thumbnail)
+
+        audio_uploader_id = audio_info.get('uid')
+
+        if is_m:
+            audio_description = self._html_search_regex(r'(?s)<section\s+class=["\']content[^>]+>(.+?)</section>',
+                                                        webpage, 'audio_description', fatal=False)
+        else:
+            audio_description = self._html_search_regex(r'(?s)<div\s+class=["\']rich_intro[^>]*>(.+?</article>)',
+                                                        webpage, 'audio_description', fatal=False)
+
+        if not audio_description:
+            audio_description_file = '%s://www.ximalaya.com/sounds/%s/rich_intro' % (scheme, audio_id)
+            audio_description = self._download_webpage(audio_description_file, audio_id,
+                                                       note='Downloading description file %s' % audio_description_file,
+                                                       errnote='Unable to download descrip file',
+                                                       fatal=False)
+            audio_description = audio_description.strip() if audio_description else None
+
+        return {
+            'id': audio_id,
+            'uploader': audio_info.get('nickname'),
+            'uploader_id': audio_uploader_id,
+            'uploader_url': self._USER_URL_FORMAT % (scheme, audio_uploader_id) if audio_uploader_id else None,
+            'title': audio_info['title'],
+            'thumbnails': thumbnails,
+            'description': audio_description,
+            'categories': list(filter(None, (audio_info.get('category_name'), audio_info.get('category_title')))),
+            'duration': audio_info.get('duration'),
+            'view_count': audio_info.get('play_count'),
+            'like_count': audio_info.get('favorites_count'),
+            'formats': formats,
+        }
+
+
+class XimalayaAlbumIE(XimalayaBaseIE):
+    IE_NAME = 'ximalaya:album'
+    IE_DESC = '喜马拉雅FM 专辑'
+    _VALID_URL = r'https?://(?:www\.|m\.)?ximalaya\.com/(?P<uid>[0-9]+)/album/(?P<id>[0-9]+)'
+    _TEMPLATE_URL = '%s://www.ximalaya.com/%s/album/%s/'
+    _BASE_URL_TEMPL = '%s://www.ximalaya.com%s'
+    _LIST_VIDEO_RE = r'<a[^>]+?href="(?P<url>/%s/sound/(?P<id>\d+)/?)"[^>]+?title="(?P<title>[^>]+)">'
+    _TESTS = [{
+        'url': 'http://www.ximalaya.com/61425525/album/5534601/',
+        'info_dict': {
+            'title': '唐诗三百首(含赏析)',
+            'id': '5534601',
+        },
+        'playlist_count': 312,
+    }, {
+        'url': 'http://m.ximalaya.com/61425525/album/5534601',
+        'info_dict': {
+            'title': '唐诗三百首(含赏析)',
+            'id': '5534601',
+        },
+        'playlist_count': 312,
+    },
+    ]
+
+    def _real_extract(self, url):
+        self.scheme = scheme = 'https' if url.startswith('https') else 'http'
+
+        mobj = re.match(self._VALID_URL, url)
+        uid, playlist_id = mobj.group('uid'), mobj.group('id')
+
+        webpage = self._download_webpage(self._TEMPLATE_URL % (scheme, uid, playlist_id), playlist_id,
+                                         note='Download album page for %s' % playlist_id,
+                                         errnote='Unable to get album info')
+
+        title = self._html_search_regex(r'detailContent_title[^>]*><h1(?:[^>]+)?>([^<]+)</h1>',
+                                        webpage, 'title', fatal=False)
+
+        return self.playlist_result(self._entries(webpage, playlist_id, uid), playlist_id, title)
+
+    def _entries(self, page, playlist_id, uid):
+        html = page
+        for page_num in itertools.count(1):
+            for entry in self._process_page(html, uid):
+                yield entry
+
+            next_url = self._search_regex(r'<a\s+href=(["\'])(?P<more>[\S]+)\1[^>]+rel=(["\'])next\3',
+                                          html, 'list_next_url', default=None, group='more')
+            if not next_url:
+                break
+
+            next_full_url = self._BASE_URL_TEMPL % (self.scheme, next_url)
+            html = self._download_webpage(next_full_url, playlist_id)
+
+    def _process_page(self, html, uid):
+        find_from = html.index('album_soundlist')
+        for mobj in re.finditer(self._LIST_VIDEO_RE % uid, html[find_from:]):
+            yield self.url_result(self._BASE_URL_TEMPL % (self.scheme, mobj.group('url')),
+                                  XimalayaIE.ie_key(),
+                                  mobj.group('id'),
+                                  mobj.group('title'))
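`XimalayaAlbumIE._entries` above uses a common pagination pattern: yield the entries found on the current page, then follow the `rel="next"` link until there is none. A self-contained sketch of the same pattern; the `PAGES` table, paths, and `fetch_page` helper are invented stand-ins for `_download_webpage` and real album markup.

```python
import itertools
import re

# Hypothetical two-page album; fetch_page() substitutes for a real HTTP fetch.
PAGES = {
    '/album/1': '<a href="/sound/11">a</a><a href="/album/1?page=2" rel="next">next</a>',
    '/album/1?page=2': '<a href="/sound/12">b</a>',
}


def fetch_page(path):
    return PAGES[path]


def entries(start_path):
    path = start_path
    for _ in itertools.count(1):
        html = fetch_page(path)
        # Yield every sound link on this page.
        for m in re.finditer(r'<a href="(/sound/\d+)">', html):
            yield m.group(1)
        # Follow the rel="next" link, or stop when the last page is reached.
        next_m = re.search(r'<a href="([^"]+)" rel="next">', html)
        if not next_m:
            break
        path = next_m.group(1)


print(list(entries('/album/1')))
```

Because `entries` is a generator, pages are only fetched as the caller consumes results, which is why the extractor can hand it straight to `playlist_result`.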
@@ -1810,7 +1810,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                 'url': video_info['conn'][0],
                 'player_url': player_url,
             }]
-        elif len(video_info.get('url_encoded_fmt_stream_map', [''])[0]) >= 1 or len(video_info.get('adaptive_fmts', [''])[0]) >= 1:
+        elif not is_live and (len(video_info.get('url_encoded_fmt_stream_map', [''])[0]) >= 1 or len(video_info.get('adaptive_fmts', [''])[0]) >= 1):
             encoded_url_map = video_info.get('url_encoded_fmt_stream_map', [''])[0] + ',' + video_info.get('adaptive_fmts', [''])[0]
             if 'rtmpe%3Dyes' in encoded_url_map:
                 raise ExtractorError('rtmpe downloads are not supported, see https://github.com/rg3/youtube-dl/issues/343 for more information.', expected=True)
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals
 
-__version__ = '2018.01.07'
+__version__ = '2018.01.14'