Merge branch 'master' into makotv
commit
7ed08c233c
6
.github/ISSUE_TEMPLATE/1_broken_site.md
vendored
@ -18,7 +18,7 @@ title: ''
|
||||
|
||||
<!--
|
||||
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
|
||||
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.09.12.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
|
||||
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.11.05. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
|
||||
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
|
||||
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
|
||||
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
|
||||
@ -26,7 +26,7 @@ Carefully read and work through this check list in order to prevent the most com
|
||||
-->
|
||||
|
||||
- [ ] I'm reporting a broken site support
|
||||
- [ ] I've verified that I'm running youtube-dl version **2019.09.12.1**
|
||||
- [ ] I've verified that I'm running youtube-dl version **2019.11.05**
|
||||
- [ ] I've checked that all provided URLs are alive and playable in a browser
|
||||
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
|
||||
- [ ] I've searched the bugtracker for similar issues including closed ones
|
||||
@ -41,7 +41,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
|
||||
[debug] User config: []
|
||||
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
|
||||
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
|
||||
[debug] youtube-dl version 2019.09.12.1
|
||||
[debug] youtube-dl version 2019.11.05
|
||||
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
|
||||
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
|
||||
[debug] Proxy map: {}
|
||||
|
@ -19,7 +19,7 @@ labels: 'site-support-request'
|
||||
|
||||
<!--
|
||||
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
|
||||
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.09.12.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
|
||||
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.11.05. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
|
||||
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
|
||||
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
|
||||
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
|
||||
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
|
||||
-->
|
||||
|
||||
- [ ] I'm reporting a new site support request
|
||||
- [ ] I've verified that I'm running youtube-dl version **2019.09.12.1**
|
||||
- [ ] I've verified that I'm running youtube-dl version **2019.11.05**
|
||||
- [ ] I've checked that all provided URLs are alive and playable in a browser
|
||||
- [ ] I've checked that none of provided URLs violate any copyrights
|
||||
- [ ] I've searched the bugtracker for similar site support requests including closed ones
|
||||
|
@ -18,13 +18,13 @@ title: ''
|
||||
|
||||
<!--
|
||||
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
|
||||
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.09.12.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
|
||||
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.11.05. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
|
||||
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
|
||||
- Finally, put x into all relevant boxes (like this [x])
|
||||
-->
|
||||
|
||||
- [ ] I'm reporting a site feature request
|
||||
- [ ] I've verified that I'm running youtube-dl version **2019.09.12.1**
|
||||
- [ ] I've verified that I'm running youtube-dl version **2019.11.05**
|
||||
- [ ] I've searched the bugtracker for similar site feature requests including closed ones
|
||||
|
||||
|
||||
|
6
.github/ISSUE_TEMPLATE/4_bug_report.md
vendored
@ -18,7 +18,7 @@ title: ''
|
||||
|
||||
<!--
|
||||
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
|
||||
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.09.12.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
|
||||
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.11.05. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
|
||||
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
|
||||
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
|
||||
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
|
||||
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
|
||||
-->
|
||||
|
||||
- [ ] I'm reporting a broken site support issue
|
||||
- [ ] I've verified that I'm running youtube-dl version **2019.09.12.1**
|
||||
- [ ] I've verified that I'm running youtube-dl version **2019.11.05**
|
||||
- [ ] I've checked that all provided URLs are alive and playable in a browser
|
||||
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
|
||||
- [ ] I've searched the bugtracker for similar bug reports including closed ones
|
||||
@ -43,7 +43,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
|
||||
[debug] User config: []
|
||||
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
|
||||
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
|
||||
[debug] youtube-dl version 2019.09.12.1
|
||||
[debug] youtube-dl version 2019.11.05
|
||||
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
|
||||
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
|
||||
[debug] Proxy map: {}
|
||||
|
4
.github/ISSUE_TEMPLATE/5_feature_request.md
vendored
@ -19,13 +19,13 @@ labels: 'request'
|
||||
|
||||
<!--
|
||||
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
|
||||
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.09.12.1. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
|
||||
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.11.05. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
|
||||
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
|
||||
- Finally, put x into all relevant boxes (like this [x])
|
||||
-->
|
||||
|
||||
- [ ] I'm reporting a feature request
|
||||
- [ ] I've verified that I'm running youtube-dl version **2019.09.12.1**
|
||||
- [ ] I've verified that I'm running youtube-dl version **2019.11.05**
|
||||
- [ ] I've searched the bugtracker for similar feature requests including closed ones
|
||||
|
||||
|
||||
|
@ -21,6 +21,12 @@ matrix:
     - python: 3.7
       dist: xenial
       env: YTDL_TEST_SET=download
+    - python: 3.8
+      dist: xenial
+      env: YTDL_TEST_SET=core
+    - python: 3.8
+      dist: xenial
+      env: YTDL_TEST_SET=download
     - python: 3.8-dev
       dist: xenial
       env: YTDL_TEST_SET=core
182
ChangeLog
@ -1,3 +1,185 @@
|
||||
version 2019.11.05
|
||||
|
||||
Extractors
|
||||
+ [scte] Add support for learning.scte.org (#22975)
|
||||
+ [msn] Add support for Vidible and AOL embeds (#22195, #22227)
|
||||
* [myspass] Fix video URL extraction and improve metadata extraction (#22448)
|
||||
* [jamendo] Improve extraction
|
||||
* Fix album extraction (#18564)
|
||||
* Improve metadata extraction (#18565, #21379)
|
||||
* [mediaset] Relax URL guid matching (#18352)
|
||||
+ [mediaset] Extract unprotected M3U and MPD manifests (#17204)
|
||||
* [telegraaf] Fix extraction
|
||||
+ [bellmedia] Add support for marilyn.ca videos (#22193)
|
||||
* [stv] Fix extraction (#22928)
|
||||
- [iconosquare] Remove extractor
|
||||
- [keek] Remove extractor
|
||||
- [gameone] Remove extractor (#21778)
|
||||
- [flipagram] Remove extractor
|
||||
- [bambuser] Remove extractor
|
||||
* [wistia] Reduce embed extraction false positives
|
||||
+ [wistia] Add support for inline embeds (#22931)
|
||||
- [go90] Remove extractor
|
||||
* [kakao] Remove raw request
|
||||
+ [kakao] Extract format total bitrate
|
||||
* [daum] Fix VOD and Clip extracton (#15015)
|
||||
* [kakao] Improve extraction
|
||||
+ Add support for embed URLs
|
||||
+ Add support for Kakao Legacy vid based embed URLs
|
||||
* Only extract fields used for extraction
|
||||
* Strip description and extract tags
|
||||
* [mixcloud] Fix cloudcast data extraction (#22821)
|
||||
* [yahoo] Improve extraction
|
||||
+ Add support for live streams (#3597, #3779, #22178)
|
||||
* Bypass cookie consent page for european domains (#16948, #22576)
|
||||
+ Add generic support for embeds (#20332)
|
||||
* [tv2] Fix and improve extraction (#22787)
|
||||
+ [tv2dk] Add support for TV2 DK sites
|
||||
* [onet] Improve extraction …
|
||||
+ Add support for onet100.vod.pl
|
||||
+ Extract m3u8 formats
|
||||
* Correct audio only format info
|
||||
* [fox9] Fix extraction
|
||||
|
||||
|
||||
version 2019.10.29
|
||||
|
||||
Core
|
||||
* [utils] Actualize major IPv4 address blocks per country
|
||||
|
||||
Extractors
|
||||
+ [go] Add support for abc.com and freeform.com (#22823, #22864)
|
||||
+ [mtv] Add support for mtvjapan.com
|
||||
* [mtv] Fix extraction for mtv.de (#22113)
|
||||
* [videodetective] Fix extraction
|
||||
* [internetvideoarchive] Fix extraction
|
||||
* [nbcnews] Fix extraction (#12569, #12576, #21703, #21923)
|
||||
- [hark] Remove extractor
|
||||
- [tutv] Remove extractor
|
||||
- [learnr] Remove extractor
|
||||
- [macgamestore] Remove extractor
|
||||
* [la7] Update Kaltura service URL (#22358)
|
||||
* [thesun] Fix extraction (#16966)
|
||||
- [makertv] Remove extractor
|
||||
+ [tenplay] Add support for 10play.com.au (#21446)
|
||||
* [soundcloud] Improve extraction
|
||||
* Improve format extraction (#22123)
|
||||
+ Extract uploader_id and uploader_url (#21916)
|
||||
+ Extract all known thumbnails (#19071, #20659)
|
||||
* Fix extration for private playlists (#20976)
|
||||
+ Add support for playlist embeds (#20976)
|
||||
* Skip preview formats (#22806)
|
||||
* [dplay] Improve extraction
|
||||
+ Add support for dplay.fi, dplay.jp and es.dplay.com (#16969)
|
||||
* Fix it.dplay.com extraction (#22826)
|
||||
+ Extract creator, tags and thumbnails
|
||||
* Handle playback API call errors
|
||||
+ [discoverynetworks] Add support for dplay.co.uk
|
||||
* [vk] Improve extraction
|
||||
+ Add support for Odnoklassniki embeds
|
||||
+ Extract more videos from user lists (#4470)
|
||||
+ Fix wall post audio extraction (#18332)
|
||||
* Improve error detection (#22568)
|
||||
+ [odnoklassniki] Add support for embeds
|
||||
* [puhutv] Improve extraction
|
||||
* Fix subtitles extraction
|
||||
* Transform HLS URLs to HTTP URLs
|
||||
* Improve metadata extraction
|
||||
* [ceskatelevize] Skip DRM media
|
||||
+ [facebook] Extract subtitles (#22777)
|
||||
* [globo] Handle alternative hash signing method
|
||||
|
||||
|
||||
version 2019.10.22
|
||||
|
||||
Core
|
||||
* [utils] Improve subtitles_filename (#22753)
|
||||
|
||||
Extractors
|
||||
* [facebook] Bypass download rate limits (#21018)
|
||||
+ [contv] Add support for contv.com
|
||||
- [viewster] Remove extractor
|
||||
* [xfileshare] Improve extractor (#17032, #17906, #18237, #18239)
|
||||
* Update the list of domains
|
||||
+ Add support for aa-encoded video data
|
||||
* Improve jwplayer format extraction
|
||||
+ Add support for Clappr sources
|
||||
* [mangomolo] Fix video format extraction and add support for player URLs
|
||||
* [audioboom] Improve metadata extraction
|
||||
* [twitch] Update VOD URL matching (#22395, #22727)
|
||||
- [mit] Remove support for video.mit.edu (#22403)
|
||||
- [servingsys] Remove extractor (#22639)
|
||||
* [dumpert] Fix extraction (#22428, #22564)
|
||||
* [atresplayer] Fix extraction (#16277, #16716)
|
||||
|
||||
|
||||
version 2019.10.16
|
||||
|
||||
Core
|
||||
* [extractor/common] Make _is_valid_url more relaxed
|
||||
|
||||
Extractors
|
||||
* [vimeo] Improve album videos id extraction (#22599)
|
||||
+ [globo] Extract subtitles (#22713)
|
||||
* [bokecc] Improve player params extraction (#22638)
|
||||
* [nexx] Handle result list (#22666)
|
||||
* [vimeo] Fix VHX embed extraction
|
||||
* [nbc] Switch to graphql API (#18581, #22693, #22701)
|
||||
- [vessel] Remove extractor
|
||||
- [promptfile] Remove extractor (#6239)
|
||||
* [kaltura] Fix service URL extraction (#22658)
|
||||
* [kaltura] Fix embed info strip (#22658)
|
||||
* [globo] Fix format extraction (#20319)
|
||||
* [redtube] Improve metadata extraction (#22492, #22615)
|
||||
* [pornhub:uservideos:upload] Fix extraction (#22619)
|
||||
+ [telequebec:squat] Add support for squat.telequebec.tv (#18503)
|
||||
- [wimp] Remove extractor (#22088, #22091)
|
||||
+ [gfycat] Extend URL regular expression (#22225)
|
||||
+ [chaturbate] Extend URL regular expression (#22309)
|
||||
* [peertube] Update instances (#22414)
|
||||
+ [telequebec] Add support for coucou.telequebec.tv (#22482)
|
||||
+ [xvideos] Extend URL regular expression (#22471)
|
||||
- [youtube] Remove support for invidious.enkirton.net (#22543)
|
||||
+ [openload] Add support for oload.monster (#22592)
|
||||
* [nrktv:seriebase] Fix extraction (#22596)
|
||||
+ [youtube] Add support for yt.lelux.fi (#22597)
|
||||
* [orf:tvthek] Make manifest requests non fatal (#22578)
|
||||
* [teachable] Skip login when already logged in (#22572)
|
||||
* [viewlift] Improve extraction (#22545)
|
||||
* [nonktube] Fix extraction (#22544)
|
||||
|
||||
|
||||
version 2019.09.28
|
||||
|
||||
Core
|
||||
* [YoutubeDL] Honour all --get-* options with --flat-playlist (#22493)
|
||||
|
||||
Extractors
|
||||
* [vk] Fix extraction (#22522)
|
||||
* [heise] Fix kaltura embeds extraction (#22514)
|
||||
* [ted] Check for resources validity and extract subtitled downloads (#22513)
|
||||
+ [youtube] Add support for
|
||||
owxfohz4kjyv25fvlqilyxast7inivgiktls3th44jhk3ej3i7ya.b32.i2p (#22292)
|
||||
+ [nhk] Add support for clips
|
||||
* [nhk] Fix video extraction (#22249, #22353)
|
||||
* [byutv] Fix extraction (#22070)
|
||||
+ [openload] Add support for oload.online (#22304)
|
||||
+ [youtube] Add support for invidious.drycat.fr (#22451)
|
||||
* [jwplatfom] Do not match video URLs (#20596, #22148)
|
||||
* [youtube:playlist] Unescape playlist uploader (#22483)
|
||||
+ [bilibili] Add support audio albums and songs (#21094)
|
||||
+ [instagram] Add support for tv URLs
|
||||
+ [mixcloud] Allow uppercase letters in format URLs (#19280)
|
||||
* [brightcove] Delegate all supported legacy URLs to new extractor (#11523,
|
||||
#12842, #13912, #15669, #16303)
|
||||
* [hotstar] Use native HLS downloader by default
|
||||
+ [hotstar] Extract more formats (#22323)
|
||||
* [9now] Fix extraction (#22361)
|
||||
* [zdf] Bypass geo restriction
|
||||
+ [tv4] Extract series metadata
|
||||
* [tv4] Fix extraction (#22443)
|
||||
|
||||
|
||||
version 2019.09.12.1
|
||||
|
||||
Extractors
|
||||
|
@ -752,8 +752,8 @@ As a last resort, you can also uninstall the version installed by your package m
 Afterwards, simply follow [our manual installation instructions](https://ytdl-org.github.io/youtube-dl/download.html):
 
 ```
-sudo wget https://yt-dl.org/latest/youtube-dl -O /usr/local/bin/youtube-dl
-sudo chmod a+x /usr/local/bin/youtube-dl
+sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl
+sudo chmod a+rx /usr/local/bin/youtube-dl
 hash -r
 ```
 
@ -76,8 +76,6 @@
|
||||
- **awaan:video**
|
||||
- **AZMedien**: AZ Medien videos
|
||||
- **BaiduVideo**: 百度视频
|
||||
- **bambuser**
|
||||
- **bambuser:channel**
|
||||
- **Bandcamp**
|
||||
- **Bandcamp:album**
|
||||
- **Bandcamp:weekly**
|
||||
@ -98,6 +96,8 @@
|
||||
- **Bigflix**
|
||||
- **Bild**: Bild.de
|
||||
- **BiliBili**
|
||||
- **BilibiliAudio**
|
||||
- **BilibiliAudioAlbum**
|
||||
- **BioBioChileTV**
|
||||
- **BIQLE**
|
||||
- **BitChute**
|
||||
@ -181,6 +181,7 @@
|
||||
- **ComedyCentralShortname**
|
||||
- **ComedyCentralTV**
|
||||
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
|
||||
- **CONtv**
|
||||
- **Corus**
|
||||
- **Coub**
|
||||
- **Cracked**
|
||||
@ -229,7 +230,6 @@
|
||||
- **DouyuShow**
|
||||
- **DouyuTV**: 斗鱼
|
||||
- **DPlay**
|
||||
- **DPlayIt**
|
||||
- **DRBonanza**
|
||||
- **Dropbox**
|
||||
- **DrTuber**
|
||||
@ -282,12 +282,12 @@
|
||||
- **FiveThirtyEight**
|
||||
- **FiveTV**
|
||||
- **Flickr**
|
||||
- **Flipagram**
|
||||
- **Folketinget**: Folketinget (ft.dk; Danish parliament)
|
||||
- **FootyRoom**
|
||||
- **Formula1**
|
||||
- **FOX**
|
||||
- **FOX9**
|
||||
- **FOX9News**
|
||||
- **Foxgay**
|
||||
- **foxnews**: Fox News and Fox Business Video
|
||||
- **foxnews:article**
|
||||
@ -313,8 +313,6 @@
|
||||
- **FXNetworks**
|
||||
- **Gaia**
|
||||
- **GameInformer**
|
||||
- **GameOne**
|
||||
- **gameone:playlist**
|
||||
- **GameSpot**
|
||||
- **GameStar**
|
||||
- **Gaskrank**
|
||||
@ -329,14 +327,12 @@
|
||||
- **Globo**
|
||||
- **GloboArticle**
|
||||
- **Go**
|
||||
- **Go90**
|
||||
- **GodTube**
|
||||
- **Golem**
|
||||
- **GoogleDrive**
|
||||
- **Goshgay**
|
||||
- **GPUTechConf**
|
||||
- **Groupon**
|
||||
- **Hark**
|
||||
- **hbo**
|
||||
- **HearThisAt**
|
||||
- **Heise**
|
||||
@ -365,7 +361,6 @@
|
||||
- **Hungama**
|
||||
- **HungamaSong**
|
||||
- **Hypem**
|
||||
- **Iconosquare**
|
||||
- **ign.com**
|
||||
- **imdb**: Internet Movie Database trailers
|
||||
- **imdb:list**: Internet Movie Database lists
|
||||
@ -405,7 +400,6 @@
|
||||
- **Kankan**
|
||||
- **Karaoketv**
|
||||
- **KarriereVideos**
|
||||
- **keek**
|
||||
- **KeezMovies**
|
||||
- **Ketnet**
|
||||
- **KhanAcademy**
|
||||
@ -429,7 +423,6 @@
|
||||
- **Lcp**
|
||||
- **LcpPlay**
|
||||
- **Le**: 乐视网
|
||||
- **Learnr**
|
||||
- **Lecture2Go**
|
||||
- **Lecturio**
|
||||
- **LecturioCourse**
|
||||
@ -463,11 +456,9 @@
|
||||
- **lynda**: lynda.com videos
|
||||
- **lynda:course**: lynda.com online courses
|
||||
- **m6**
|
||||
- **macgamestore**: MacGameStore trailers
|
||||
- **mailru**: Видео@Mail.Ru
|
||||
- **mailru:music**: Музыка@Mail.Ru
|
||||
- **mailru:music:search**: Музыка@Mail.Ru
|
||||
- **MakerTV**
|
||||
- **MallTV**
|
||||
- **mangomolo:live**
|
||||
- **mangomolo:video**
|
||||
@ -523,8 +514,8 @@
|
||||
- **mtg**: MTG services
|
||||
- **mtv**
|
||||
- **mtv.de**
|
||||
- **mtv81**
|
||||
- **mtv:video**
|
||||
- **mtvjapan**
|
||||
- **mtvservices:embedded**
|
||||
- **MuenchenTV**: münchen.tv
|
||||
- **MusicPlayOn**
|
||||
@ -692,7 +683,6 @@
|
||||
- **PornoXO**
|
||||
- **PornTube**
|
||||
- **PressTV**
|
||||
- **PromptFile**
|
||||
- **prosiebensat1**: ProSiebenSat.1 Digital
|
||||
- **puhutv**
|
||||
- **puhutv:serie**
|
||||
@ -780,10 +770,11 @@
|
||||
- **Screencast**
|
||||
- **ScreencastOMatic**
|
||||
- **scrippsnetworks:watch**
|
||||
- **SCTE**
|
||||
- **SCTECourse**
|
||||
- **Seeker**
|
||||
- **SenateISVP**
|
||||
- **SendtoNews**
|
||||
- **ServingSys**
|
||||
- **Servus**
|
||||
- **Sexu**
|
||||
- **SeznamZpravy**
|
||||
@ -814,6 +805,7 @@
|
||||
- **soundcloud:set**
|
||||
- **soundcloud:trackstation**
|
||||
- **soundcloud:user**
|
||||
- **SoundcloudEmbed**
|
||||
- **soundgasm**
|
||||
- **soundgasm:profile**
|
||||
- **southpark.cc.com**
|
||||
@ -882,9 +874,11 @@
|
||||
- **TeleQuebec**
|
||||
- **TeleQuebecEmission**
|
||||
- **TeleQuebecLive**
|
||||
- **TeleQuebecSquat**
|
||||
- **TeleTask**
|
||||
- **Telewebion**
|
||||
- **TennisTV**
|
||||
- **TenPlay**
|
||||
- **TF1**
|
||||
- **TFO**
|
||||
- **TheIntercept**
|
||||
@ -923,11 +917,11 @@
|
||||
- **tunein:topic**
|
||||
- **TunePk**
|
||||
- **Turbo**
|
||||
- **Tutv**
|
||||
- **tv.dfb.de**
|
||||
- **TV2**
|
||||
- **tv2.hu**
|
||||
- **TV2Article**
|
||||
- **TV2DK**
|
||||
- **TV4**: tv4.se and tv4play.se
|
||||
- **TV5MondePlus**: TV5MONDE+
|
||||
- **TVA**
|
||||
@ -989,7 +983,6 @@
|
||||
- **VeeHD**
|
||||
- **Veoh**
|
||||
- **verystream**
|
||||
- **Vessel**
|
||||
- **Vesti**: Вести.Ru
|
||||
- **Vevo**
|
||||
- **VevoPlaylist**
|
||||
@ -1004,7 +997,6 @@
|
||||
- **Viddler**
|
||||
- **Videa**
|
||||
- **video.google:search**: Google Video search
|
||||
- **video.mit.edu**
|
||||
- **VideoDetective**
|
||||
- **videofy.me**
|
||||
- **videomore**
|
||||
@ -1022,7 +1014,6 @@
|
||||
- **vier:videos**
|
||||
- **ViewLift**
|
||||
- **ViewLiftEmbed**
|
||||
- **Viewster**
|
||||
- **Viidea**
|
||||
- **viki**
|
||||
- **viki:channel**
|
||||
@ -1088,7 +1079,6 @@
|
||||
- **Weibo**
|
||||
- **WeiboMobile**
|
||||
- **WeiqiTV**: WQTV
|
||||
- **Wimp**
|
||||
- **Wistia**
|
||||
- **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
|
||||
- **WorldStarHipHop**
|
||||
@ -1097,7 +1087,7 @@
|
||||
- **WWE**
|
||||
- **XBef**
|
||||
- **XboxClips**
|
||||
- **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To, XVIDSTAGE, Vid ABC, VidBom, vidlo, RapidVideo.TV, FastVideo.me
|
||||
- **XFileShare**: XFileShare based sites: ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, XVideoSharing
|
||||
- **XHamster**
|
||||
- **XHamsterEmbed**
|
||||
- **XHamsterUser**
|
||||
|
@ -123,12 +123,6 @@ class TestAllURLsMatching(unittest.TestCase):
         self.assertMatch('http://video.pbs.org/viralplayer/2365173446/', ['pbs'])
         self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['pbs'])
 
-    def test_yahoo_https(self):
-        # https://github.com/ytdl-org/youtube-dl/issues/2701
-        self.assertMatch(
-            'https://screen.yahoo.com/smartwatches-latest-wearable-gadgets-163745379-cbs.html',
-            ['Yahoo'])
-
     def test_no_duplicated_ie_names(self):
         name_accu = collections.defaultdict(list)
         for ie in self.ies:
@ -74,6 +74,7 @@ from youtube_dl.utils import (
     str_to_int,
     strip_jsonp,
     strip_or_none,
+    subtitles_filename,
     timeconvert,
     unescapeHTML,
     unified_strdate,
@ -261,6 +262,11 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(replace_extension('.abc', 'temp'), '.abc.temp')
         self.assertEqual(replace_extension('.abc.ext', 'temp'), '.abc.temp')
 
+    def test_subtitles_filename(self):
+        self.assertEqual(subtitles_filename('abc.ext', 'en', 'vtt'), 'abc.en.vtt')
+        self.assertEqual(subtitles_filename('abc.ext', 'en', 'vtt', 'ext'), 'abc.en.vtt')
+        self.assertEqual(subtitles_filename('abc.unexpected_ext', 'en', 'vtt', 'ext'), 'abc.unexpected_ext.en.vtt')
+
     def test_remove_start(self):
         self.assertEqual(remove_start(None, 'A - '), None)
         self.assertEqual(remove_start('A - B', 'A - '), 'B')
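Review note: the new `test_subtitles_filename` cases pin down the optional fourth argument. The real extension is replaced only when it matches the expected one; otherwise the subtitle suffix is appended to the whole filename. Below is a minimal sketch that satisfies those assertions. It assumes the helper is built on `replace_extension` with a parameter named `expected_real_ext`; that name and both function bodies are assumptions, since the corresponding `utils.py` change is not shown in this part of the diff.

```python
import os


def replace_extension(filename, ext, expected_real_ext=None):
    # Split off the last extension, e.g. 'abc.ext' -> ('abc', '.ext').
    name, real_ext = os.path.splitext(filename)
    # Assumed behaviour: if a specific extension was expected and the file
    # does not carry it, keep the full filename and append instead.
    if expected_real_ext and real_ext[1:] != expected_real_ext:
        name = filename
    return '%s.%s' % (name, ext)


def subtitles_filename(filename, sub_lang, sub_format, expected_real_ext=None):
    # The subtitle "extension" is the language code plus the subtitle format.
    return replace_extension(
        filename, sub_lang + '.' + sub_format, expected_real_ext)
```

Passing `info_dict.get('ext')` as the fourth argument, as the `YoutubeDL.py` hunk does, keeps a video title that happens to contain a dot from being truncated when the subtitle name is derived.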
@ -852,8 +852,9 @@ class YoutubeDL(object):
             extract_flat = self.params.get('extract_flat', False)
             if ((extract_flat == 'in_playlist' and 'playlist' in extra_info)
                     or extract_flat is True):
-                if self.params.get('forcejson', False):
-                    self.to_stdout(json.dumps(ie_result))
+                self.__forced_printings(
+                    ie_result, self.prepare_filename(ie_result),
+                    incomplete=True)
                 return ie_result
 
         if result_type == 'video':
@ -1693,6 +1694,36 @@ class YoutubeDL(object):
             subs[lang] = f
         return subs
 
+    def __forced_printings(self, info_dict, filename, incomplete):
+        def print_mandatory(field):
+            if (self.params.get('force%s' % field, False)
+                    and (not incomplete or info_dict.get(field) is not None)):
+                self.to_stdout(info_dict[field])
+
+        def print_optional(field):
+            if (self.params.get('force%s' % field, False)
+                    and info_dict.get(field) is not None):
+                self.to_stdout(info_dict[field])
+
+        print_mandatory('title')
+        print_mandatory('id')
+        if self.params.get('forceurl', False) and not incomplete:
+            if info_dict.get('requested_formats') is not None:
+                for f in info_dict['requested_formats']:
+                    self.to_stdout(f['url'] + f.get('play_path', ''))
+            else:
+                # For RTMP URLs, also include the playpath
+                self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
+        print_optional('thumbnail')
+        print_optional('description')
+        if self.params.get('forcefilename', False) and filename is not None:
+            self.to_stdout(filename)
+        if self.params.get('forceduration', False) and info_dict.get('duration') is not None:
+            self.to_stdout(formatSeconds(info_dict['duration']))
+        print_mandatory('format')
+        if self.params.get('forcejson', False):
+            self.to_stdout(json.dumps(info_dict))
+
     def process_info(self, info_dict):
         """Process a single resolved IE result."""
 
@ -1703,9 +1734,8 @@ class YoutubeDL(object):
             if self._num_downloads >= int(max_downloads):
                 raise MaxDownloadsReached()
 
+        # TODO: backward compatibility, to be removed
         info_dict['fulltitle'] = info_dict['title']
-        if len(info_dict['title']) > 200:
-            info_dict['title'] = info_dict['title'][:197] + '...'
 
         if 'format' not in info_dict:
             info_dict['format'] = info_dict['ext']
@ -1720,29 +1750,7 @@ class YoutubeDL(object):
         info_dict['_filename'] = filename = self.prepare_filename(info_dict)
 
         # Forced printings
-        if self.params.get('forcetitle', False):
-            self.to_stdout(info_dict['fulltitle'])
-        if self.params.get('forceid', False):
-            self.to_stdout(info_dict['id'])
-        if self.params.get('forceurl', False):
-            if info_dict.get('requested_formats') is not None:
-                for f in info_dict['requested_formats']:
-                    self.to_stdout(f['url'] + f.get('play_path', ''))
-            else:
-                # For RTMP URLs, also include the playpath
-                self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
-        if self.params.get('forcethumbnail', False) and info_dict.get('thumbnail') is not None:
-            self.to_stdout(info_dict['thumbnail'])
-        if self.params.get('forcedescription', False) and info_dict.get('description') is not None:
-            self.to_stdout(info_dict['description'])
-        if self.params.get('forcefilename', False) and filename is not None:
-            self.to_stdout(filename)
-        if self.params.get('forceduration', False) and info_dict.get('duration') is not None:
-            self.to_stdout(formatSeconds(info_dict['duration']))
-        if self.params.get('forceformat', False):
-            self.to_stdout(info_dict['format'])
-        if self.params.get('forcejson', False):
-            self.to_stdout(json.dumps(info_dict))
+        self.__forced_printings(info_dict, filename, incomplete=False)
 
         # Do nothing else if in simulate mode
         if self.params.get('simulate', False):
@ -1806,7 +1814,7 @@ class YoutubeDL(object):
             ie = self.get_info_extractor(info_dict['extractor_key'])
             for sub_lang, sub_info in subtitles.items():
                 sub_format = sub_info['ext']
-                sub_filename = subtitles_filename(filename, sub_lang, sub_format)
+                sub_filename = subtitles_filename(filename, sub_lang, sub_format, info_dict.get('ext'))
                 if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)):
                     self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format))
                 else:
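Review note: these hunks move the `--get-*` printing into one helper so that the flat-playlist path can call it with `incomplete=True`. The standalone sketch below illustrates the mandatory/optional distinction. The `YoutubeDL` instance is replaced by a plain `params` dict and an `out` callback; both stand-ins are assumptions made for illustration. Only a subset of the fields is handled, so this is not the method itself.

```python
import json


def forced_printings(params, info_dict, filename, incomplete, out=print):
    """Simplified stand-in for the helper added in this diff (illustrative only)."""

    def print_mandatory(field):
        # For a complete result the field must exist, so a missing key raises.
        # For an incomplete (flat) result a missing field is skipped instead.
        if (params.get('force%s' % field, False)
                and (not incomplete or info_dict.get(field) is not None)):
            out(info_dict[field])

    def print_optional(field):
        # Optional fields are printed only when present, in both modes.
        if (params.get('force%s' % field, False)
                and info_dict.get(field) is not None):
            out(info_dict[field])

    print_mandatory('title')
    print_mandatory('id')
    # URLs are never printed for incomplete results.
    if params.get('forceurl', False) and not incomplete:
        out(info_dict['url'] + info_dict.get('play_path', ''))
    print_optional('thumbnail')
    if params.get('forcefilename', False) and filename is not None:
        out(filename)
    if params.get('forcejson', False):
        out(json.dumps(info_dict))


if __name__ == '__main__':
    wanted = {'forcetitle': True, 'forceid': True, 'forceurl': True}
    # A flat playlist entry typically lacks a title; only the id is printed.
    forced_printings(wanted, {'id': 'abc', 'url': 'http://example.com/v'},
                     None, incomplete=True)
```

With `incomplete=False` the same call would raise `KeyError` on the missing title, which matches the stricter behaviour of the fully resolved path.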
@ -1,95 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..compat import (
-    compat_HTTPError,
-    compat_str,
-    compat_urllib_parse_urlencode,
-    compat_urllib_parse_urlparse,
-)
-from ..utils import (
-    ExtractorError,
-    qualities,
-)
-
-
-class AddAnimeIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:\w+\.)?add-anime\.net/(?:watch_video\.php\?(?:.*?)v=|video/)(?P<id>[\w_]+)'
-    _TESTS = [{
-        'url': 'http://www.add-anime.net/watch_video.php?v=24MR3YO5SAS9',
-        'md5': '72954ea10bc979ab5e2eb288b21425a0',
-        'info_dict': {
-            'id': '24MR3YO5SAS9',
-            'ext': 'mp4',
-            'description': 'One Piece 606',
-            'title': 'One Piece 606',
-        },
-        'skip': 'Video is gone',
-    }, {
-        'url': 'http://add-anime.net/video/MDUGWYKNGBD8/One-Piece-687',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        try:
-            webpage = self._download_webpage(url, video_id)
-        except ExtractorError as ee:
-            if not isinstance(ee.cause, compat_HTTPError) or \
-               ee.cause.code != 503:
-                raise
-
-            redir_webpage = ee.cause.read().decode('utf-8')
-            action = self._search_regex(
-                r'<form id="challenge-form" action="([^"]+)"',
-                redir_webpage, 'Redirect form')
-            vc = self._search_regex(
-                r'<input type="hidden" name="jschl_vc" value="([^"]+)"/>',
-                redir_webpage, 'redirect vc value')
-            av = re.search(
-                r'a\.value = ([0-9]+)[+]([0-9]+)[*]([0-9]+);',
-                redir_webpage)
-            if av is None:
-                raise ExtractorError('Cannot find redirect math task')
-            av_res = int(av.group(1)) + int(av.group(2)) * int(av.group(3))
-
-            parsed_url = compat_urllib_parse_urlparse(url)
-            av_val = av_res + len(parsed_url.netloc)
-            confirm_url = (
-                parsed_url.scheme + '://' + parsed_url.netloc
-                + action + '?'
-                + compat_urllib_parse_urlencode({
-                    'jschl_vc': vc, 'jschl_answer': compat_str(av_val)}))
-            self._download_webpage(
-                confirm_url, video_id,
-                note='Confirming after redirect')
-            webpage = self._download_webpage(url, video_id)
-
-        FORMATS = ('normal', 'hq')
-        quality = qualities(FORMATS)
-        formats = []
-        for format_id in FORMATS:
-            rex = r"var %s_video_file = '(.*?)';" % re.escape(format_id)
-            video_url = self._search_regex(rex, webpage, 'video file URLx',
-                                           fatal=False)
-            if not video_url:
-                continue
-            formats.append({
-                'format_id': format_id,
-                'url': video_url,
-                'quality': quality(format_id),
-            })
-        self._sort_formats(formats)
-        video_title = self._og_search_title(webpage)
-        video_description = self._og_search_description(webpage)
-
-        return {
-            '_type': 'video',
-            'id': video_id,
-            'formats': formats,
-            'title': video_title,
-            'description': video_description
-        }
@ -1,202 +1,118 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import time
|
||||
import hmac
|
||||
import hashlib
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..compat import compat_HTTPError
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
sanitized_Request,
|
||||
urlencode_postdata,
|
||||
xpath_text,
|
||||
)
|
||||
|
||||
|
||||
class AtresPlayerIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/television/[^/]+/[^/]+/[^/]+/(?P<id>.+?)_\d+\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/[^/]+/[^/]+/(?P<display_id>.+?)_(?P<id>[0-9a-f]{24})'
|
||||
_NETRC_MACHINE = 'atresplayer'
|
||||
_TESTS = [
|
||||
{
|
||||
'url': 'http://www.atresplayer.com/television/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_2014122100174.html',
|
||||
'md5': 'efd56753cda1bb64df52a3074f62e38a',
|
||||
'url': 'https://www.atresplayer.com/antena3/series/pequenas-coincidencias/temporada-1/capitulo-7-asuntos-pendientes_5d4aa2c57ed1a88fc715a615/',
|
||||
'info_dict': {
|
||||
'id': 'capitulo-10-especial-solidario-nochebuena',
|
||||
'id': '5d4aa2c57ed1a88fc715a615',
|
||||
'ext': 'mp4',
|
||||
'title': 'Especial Solidario de Nochebuena',
|
||||
'description': 'md5:e2d52ff12214fa937107d21064075bf1',
|
||||
'duration': 5527.6,
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'title': 'Capítulo 7: Asuntos pendientes',
|
||||
'description': 'md5:7634cdcb4d50d5381bedf93efb537fbc',
|
||||
'duration': 3413,
|
||||
},
|
||||
'params': {
|
||||
'format': 'bestvideo',
|
||||
},
|
||||
'skip': 'This video is only available for registered users'
|
||||
},
|
||||
{
|
||||
'url': 'http://www.atresplayer.com/television/especial/videoencuentros/temporada-1/capitulo-112-david-bustamante_2014121600375.html',
|
||||
'md5': '6e52cbb513c405e403dbacb7aacf8747',
|
||||
'info_dict': {
|
||||
'id': 'capitulo-112-david-bustamante',
|
||||
'ext': 'flv',
|
||||
'title': 'David Bustamante',
|
||||
'description': 'md5:f33f1c0a05be57f6708d4dd83a3b81c6',
|
||||
'duration': 1439.0,
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
},
|
||||
'url': 'https://www.atresplayer.com/lasexta/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_5ad08edf986b2855ed47adc4/',
|
||||
'only_matching': True,
|
||||
},
|
||||
{
|
||||
'url': 'http://www.atresplayer.com/television/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_2014122400174.html',
|
||||
'url': 'https://www.atresplayer.com/antena3/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_5ad51046986b2886722ccdea/',
|
||||
'only_matching': True,
|
||||
},
|
||||
]
|
||||
|
||||
_USER_AGENT = 'Dalvik/1.6.0 (Linux; U; Android 4.3; GT-I9300 Build/JSS15J'
|
||||
_MAGIC = 'QWtMLXs414Yo+c#_+Q#K@NN)'
|
||||
_TIMESTAMP_SHIFT = 30000
|
||||
|
||||
_TIME_API_URL = 'http://servicios.atresplayer.com/api/admin/time.json'
|
||||
_URL_VIDEO_TEMPLATE = 'https://servicios.atresplayer.com/api/urlVideo/{1}/{0}/{1}|{2}|{3}.json'
|
||||
_PLAYER_URL_TEMPLATE = 'https://servicios.atresplayer.com/episode/getplayer.json?episodePk=%s'
|
||||
_EPISODE_URL_TEMPLATE = 'http://www.atresplayer.com/episodexml/%s'
|
||||
|
||||
_LOGIN_URL = 'https://servicios.atresplayer.com/j_spring_security_check'
|
||||
|
||||
_ERRORS = {
|
||||
'UNPUBLISHED': 'We\'re sorry, but this video is not yet available.',
|
||||
'DELETED': 'This video has expired and is no longer available for online streaming.',
|
||||
'GEOUNPUBLISHED': 'We\'re sorry, but this video is not available in your region due to right restrictions.',
|
||||
# 'PREMIUM': 'PREMIUM',
|
||||
}
|
||||
_API_BASE = 'https://api.atresplayer.com/'
|
||||
|
||||
def _real_initialize(self):
|
||||
self._login()
|
||||
|
||||
def _handle_error(self, e, code):
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code == code:
|
||||
error = self._parse_json(e.cause.read(), None)
|
||||
if error.get('error') == 'required_registered':
|
||||
self.raise_login_required()
|
||||
raise ExtractorError(error['error_description'], expected=True)
|
||||
raise
|
||||
|
||||
def _login(self):
|
||||
username, password = self._get_login_info()
|
||||
if username is None:
|
||||
return
|
||||
|
||||
login_form = {
|
||||
'j_username': username,
|
||||
'j_password': password,
|
||||
}
|
||||
self._request_webpage(
|
||||
self._API_BASE + 'login', None, 'Downloading login page')
|
||||
|
||||
request = sanitized_Request(
|
||||
self._LOGIN_URL, urlencode_postdata(login_form))
|
||||
request.add_header('Content-Type', 'application/x-www-form-urlencoded')
|
||||
response = self._download_webpage(
|
||||
request, None, 'Logging in')
|
||||
try:
|
||||
target_url = self._download_json(
|
||||
'https://account.atresmedia.com/api/login', None,
|
||||
'Logging in', headers={
|
||||
'Content-Type': 'application/x-www-form-urlencoded'
|
||||
}, data=urlencode_postdata({
|
||||
'username': username,
|
||||
'password': password,
|
||||
}))['targetUrl']
|
||||
except ExtractorError as e:
|
||||
self._handle_error(e, 400)
|
||||
|
||||
error = self._html_search_regex(
|
||||
r'(?s)<ul[^>]+class="[^"]*\blist_error\b[^"]*">(.+?)</ul>',
|
||||
response, 'error', default=None)
|
||||
if error:
|
||||
raise ExtractorError(
|
||||
'Unable to login: %s' % error, expected=True)
|
||||
self._request_webpage(target_url, None, 'Following Target URL')
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
display_id, video_id = re.match(self._VALID_URL, url).groups()
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
try:
|
||||
episode = self._download_json(
|
||||
self._API_BASE + 'client/v1/player/episode/' + video_id, video_id)
|
||||
except ExtractorError as e:
|
||||
self._handle_error(e, 403)
|
||||
|
||||
episode_id = self._search_regex(
|
||||
r'episode="([^"]+)"', webpage, 'episode id')
|
||||
|
||||
request = sanitized_Request(
|
||||
self._PLAYER_URL_TEMPLATE % episode_id,
|
||||
headers={'User-Agent': self._USER_AGENT})
|
||||
player = self._download_json(request, episode_id, 'Downloading player JSON')
|
||||
|
||||
episode_type = player.get('typeOfEpisode')
|
||||
error_message = self._ERRORS.get(episode_type)
|
||||
if error_message:
|
||||
raise ExtractorError(
|
||||
'%s returned error: %s' % (self.IE_NAME, error_message), expected=True)
|
||||
title = episode['titulo']
|
||||
|
||||
formats = []
|
||||
video_url = player.get('urlVideo')
|
||||
if video_url:
|
||||
format_info = {
|
||||
'url': video_url,
|
||||
'format_id': 'http',
|
||||
}
|
||||
mobj = re.search(r'(?P<bitrate>\d+)K_(?P<width>\d+)x(?P<height>\d+)', video_url)
|
||||
if mobj:
|
||||
format_info.update({
|
||||
'width': int_or_none(mobj.group('width')),
|
||||
'height': int_or_none(mobj.group('height')),
|
||||
'tbr': int_or_none(mobj.group('bitrate')),
|
||||
})
|
||||
formats.append(format_info)
|
||||
|
||||
timestamp = int_or_none(self._download_webpage(
|
||||
self._TIME_API_URL,
|
||||
video_id, 'Downloading timestamp', fatal=False), 1000, time.time())
|
||||
timestamp_shifted = compat_str(timestamp + self._TIMESTAMP_SHIFT)
|
||||
token = hmac.new(
|
||||
self._MAGIC.encode('ascii'),
|
||||
(episode_id + timestamp_shifted).encode('utf-8'), hashlib.md5
|
||||
).hexdigest()
|
||||
|
||||
request = sanitized_Request(
|
||||
self._URL_VIDEO_TEMPLATE.format('windows', episode_id, timestamp_shifted, token),
|
||||
headers={'User-Agent': self._USER_AGENT})
|
||||
|
||||
fmt_json = self._download_json(
|
||||
request, video_id, 'Downloading windows video JSON')
|
||||
|
||||
result = fmt_json.get('resultDes')
|
||||
if result.lower() != 'ok':
|
||||
raise ExtractorError(
|
||||
'%s returned error: %s' % (self.IE_NAME, result), expected=True)
|
||||
|
||||
for format_id, video_url in fmt_json['resultObject'].items():
|
||||
if format_id == 'token' or not video_url.startswith('http'):
|
||||
for source in episode.get('sources', []):
|
||||
src = source.get('src')
|
||||
if not src:
|
||||
continue
|
||||
if 'geodeswowsmpra3player' in video_url:
|
||||
# f4m_path = video_url.split('smil:', 1)[-1].split('free_', 1)[0]
|
||||
# f4m_url = 'http://drg.antena3.com/{0}hds/es/sd.f4m'.format(f4m_path)
|
||||
# this videos are protected by DRM, the f4m downloader doesn't support them
|
||||
continue
|
||||
video_url_hd = video_url.replace('free_es', 'es')
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
video_url_hd[:-9] + '/manifest.f4m', video_id, f4m_id='hds',
|
||||
fatal=False))
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
video_url_hd[:-9] + '/manifest.mpd', video_id, mpd_id='dash',
|
||||
fatal=False))
|
||||
src_type = source.get('type')
|
||||
if src_type == 'application/vnd.apple.mpegurl':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
src, video_id, 'mp4', 'm3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
elif src_type == 'application/dash+xml':
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
src, video_id, mpd_id='dash', fatal=False))
|
||||
self._sort_formats(formats)
|
||||
|
||||
path_data = player.get('pathData')
|
||||
|
||||
episode = self._download_xml(
|
||||
self._EPISODE_URL_TEMPLATE % path_data, video_id,
|
||||
'Downloading episode XML')
|
||||
|
||||
duration = float_or_none(xpath_text(
|
||||
episode, './media/asset/info/technical/contentDuration', 'duration'))
|
||||
|
||||
art = episode.find('./media/asset/info/art')
|
||||
title = xpath_text(art, './name', 'title')
|
||||
description = xpath_text(art, './description', 'description')
|
||||
thumbnail = xpath_text(episode, './media/asset/files/background', 'thumbnail')
|
||||
|
||||
subtitles = {}
|
||||
subtitle_url = xpath_text(episode, './media/asset/files/subtitle', 'subtitle')
|
||||
if subtitle_url:
|
||||
subtitles['es'] = [{
|
||||
'ext': 'srt',
|
||||
'url': subtitle_url,
|
||||
}]
|
||||
heartbeat = episode.get('heartbeat') or {}
|
||||
omniture = episode.get('omniture') or {}
|
||||
get_meta = lambda x: heartbeat.get(x) or omniture.get(x)
|
||||
|
||||
return {
|
||||
'display_id': display_id,
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'thumbnail': thumbnail,
|
||||
'duration': duration,
|
||||
'description': episode.get('descripcion'),
|
||||
'thumbnail': episode.get('imgPoster'),
|
||||
'duration': int_or_none(episode.get('duration')),
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
'channel': get_meta('channel'),
|
||||
'season': get_meta('season'),
|
||||
'episode_number': int_or_none(get_meta('episodeNumber')),
|
||||
}
|
||||
|
@ -2,22 +2,25 @
 from __future__ import unicode_literals

 from .common import InfoExtractor
-from ..utils import float_or_none
+from ..utils import (
+    clean_html,
+    float_or_none,
+)


 class AudioBoomIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?audioboom\.com/(?:boos|posts)/(?P<id>[0-9]+)'
     _TESTS = [{
-        'url': 'https://audioboom.com/boos/4279833-3-09-2016-czaban-hour-3?t=0',
-        'md5': '63a8d73a055c6ed0f1e51921a10a5a76',
+        'url': 'https://audioboom.com/posts/7398103-asim-chaudhry',
+        'md5': '7b00192e593ff227e6a315486979a42d',
         'info_dict': {
-            'id': '4279833',
+            'id': '7398103',
             'ext': 'mp3',
-            'title': '3/09/2016 Czaban Hour 3',
-            'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
-            'duration': 2245.72,
-            'uploader': 'SB Nation A.M.',
-            'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
+            'title': 'Asim Chaudhry',
+            'description': 'md5:2f3fef17dacc2595b5362e1d7d3602fc',
+            'duration': 4000.99,
+            'uploader': 'Sue Perkins: An hour or so with...',
+            'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/perkins',
         }
     }, {
         'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0',
@ -32,8 +35,8 @ class AudioBoomIE(InfoExtractor):
         clip = None

         clip_store = self._parse_json(
-            self._search_regex(
-                r'data-new-clip-store=(["\'])(?P<json>{.*?"clipId"\s*:\s*%s.*?})\1' % video_id,
+            self._html_search_regex(
+                r'data-new-clip-store=(["\'])(?P<json>{.+?})\1',
                 webpage, 'clip store', default='{}', group='json'),
             video_id, fatal=False)
         if clip_store:
@ -47,14 +50,15 @ class AudioBoomIE(InfoExtractor):

         audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property(
             'audio', webpage, 'audio url')
-        title = from_clip('title') or self._og_search_title(webpage)
-        description = from_clip('description') or self._og_search_description(webpage)
+        title = from_clip('title') or self._html_search_meta(
+            ['og:title', 'og:audio:title', 'audio_title'], webpage)
+        description = from_clip('description') or clean_html(from_clip('formattedDescription')) or self._og_search_description(webpage)

         duration = float_or_none(from_clip('duration') or self._html_search_meta(
             'weibo:audio:duration', webpage))

-        uploader = from_clip('author') or self._og_search_property(
-            'audio:artist', webpage, 'uploader', fatal=False)
+        uploader = from_clip('author') or self._html_search_meta(
+            ['og:audio:artist', 'twitter:audio:artist_name', 'audio_artist'], webpage, 'uploader')
         uploader_url = from_clip('author_url') or self._html_search_meta(
             'audioboo:channel', webpage, 'uploader url')

@ -1,142 +0,0 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
import itertools
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
sanitized_Request,
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class BambuserIE(InfoExtractor):
|
||||
IE_NAME = 'bambuser'
|
||||
_VALID_URL = r'https?://bambuser\.com/v/(?P<id>\d+)'
|
||||
_API_KEY = '005f64509e19a868399060af746a00aa'
|
||||
_LOGIN_URL = 'https://bambuser.com/user'
|
||||
_NETRC_MACHINE = 'bambuser'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://bambuser.com/v/4050584',
|
||||
# MD5 seems to be flaky, see https://travis-ci.org/ytdl-org/youtube-dl/jobs/14051016#L388
|
||||
# 'md5': 'fba8f7693e48fd4e8641b3fd5539a641',
|
||||
'info_dict': {
|
||||
'id': '4050584',
|
||||
'ext': 'flv',
|
||||
'title': 'Education engineering days - lightning talks',
|
||||
'duration': 3741,
|
||||
'uploader': 'pixelversity',
|
||||
'uploader_id': '344706',
|
||||
'timestamp': 1382976692,
|
||||
'upload_date': '20131028',
|
||||
'view_count': int,
|
||||
},
|
||||
'params': {
|
||||
# It doesn't respect the 'Range' header, it would download the whole video
|
||||
# caused the travis builds to fail: https://travis-ci.org/ytdl-org/youtube-dl/jobs/14493845#L59
|
||||
'skip_download': True,
|
||||
},
|
||||
}
|
||||
|
||||
def _login(self):
|
||||
username, password = self._get_login_info()
|
||||
if username is None:
|
||||
return
|
||||
|
||||
login_form = {
|
||||
'form_id': 'user_login',
|
||||
'op': 'Log in',
|
||||
'name': username,
|
||||
'pass': password,
|
||||
}
|
||||
|
||||
request = sanitized_Request(
|
||||
self._LOGIN_URL, urlencode_postdata(login_form))
|
||||
request.add_header('Referer', self._LOGIN_URL)
|
||||
response = self._download_webpage(
|
||||
request, None, 'Logging in')
|
||||
|
||||
login_error = self._html_search_regex(
|
||||
r'(?s)<div class="messages error">(.+?)</div>',
|
||||
response, 'login error', default=None)
|
||||
if login_error:
|
||||
raise ExtractorError(
|
||||
'Unable to login: %s' % login_error, expected=True)
|
||||
|
||||
def _real_initialize(self):
|
||||
self._login()
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
info = self._download_json(
|
||||
'http://player-c.api.bambuser.com/getVideo.json?api_key=%s&vid=%s'
|
||||
% (self._API_KEY, video_id), video_id)
|
||||
|
||||
error = info.get('error')
|
||||
if error:
|
||||
raise ExtractorError(
|
||||
'%s returned error: %s' % (self.IE_NAME, error), expected=True)
|
||||
|
||||
result = info['result']
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': result['title'],
|
||||
'url': result['url'],
|
||||
'thumbnail': result.get('preview'),
|
||||
'duration': int_or_none(result.get('length')),
|
||||
'uploader': result.get('username'),
|
||||
'uploader_id': compat_str(result.get('owner', {}).get('uid')),
|
||||
'timestamp': int_or_none(result.get('created')),
|
||||
'fps': float_or_none(result.get('framerate')),
|
||||
'view_count': int_or_none(result.get('views_total')),
|
||||
'comment_count': int_or_none(result.get('comment_count')),
|
||||
}
|
||||
|
||||
|
||||
class BambuserChannelIE(InfoExtractor):
|
||||
IE_NAME = 'bambuser:channel'
|
||||
_VALID_URL = r'https?://bambuser\.com/channel/(?P<user>.*?)(?:/|#|\?|$)'
|
||||
# The maximum number we can get with each request
|
||||
_STEP = 50
|
||||
_TEST = {
|
||||
'url': 'http://bambuser.com/channel/pixelversity',
|
||||
'info_dict': {
|
||||
'title': 'pixelversity',
|
||||
},
|
||||
'playlist_mincount': 60,
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
user = mobj.group('user')
|
||||
urls = []
|
||||
last_id = ''
|
||||
for i in itertools.count(1):
|
||||
req_url = (
|
||||
'http://bambuser.com/xhr-api/index.php?username={user}'
|
||||
'&sort=created&access_mode=0%2C1%2C2&limit={count}'
|
||||
'&method=broadcast&format=json&vid_older_than={last}'
|
||||
).format(user=user, count=self._STEP, last=last_id)
|
||||
req = sanitized_Request(req_url)
|
||||
# Without setting this header, we wouldn't get any result
|
||||
req.add_header('Referer', 'http://bambuser.com/channel/%s' % user)
|
||||
data = self._download_json(
|
||||
req, user, 'Downloading page %d' % i)
|
||||
results = data['result']
|
||||
if not results:
|
||||
break
|
||||
last_id = results[-1]['vid']
|
||||
urls.extend(self.url_result(v['page'], 'Bambuser') for v in results)
|
||||
|
||||
return {
|
||||
'_type': 'playlist',
|
||||
'title': user,
|
||||
'entries': urls,
|
||||
}
|
@ -22,7 +22,8 @ class BellMediaIE(InfoExtractor):
                 bravo|
                 mtv|
                 space|
-                etalk
+                etalk|
+                marilyn
             )\.ca|
             much\.com
         )/.*?(?:\bvid(?:eoid)?=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})'''
@ -70,6 +71,7 @ class BellMediaIE(InfoExtractor):
         'animalplanet': 'aniplan',
         'etalk': 'ctv',
         'bnnbloomberg': 'bnn',
+        'marilyn': 'ctv_marilyn',
     }

     def _real_extract(self, url):
@ -15,6 +15,7 @ from ..utils import (
     float_or_none,
     parse_iso8601,
     smuggle_url,
+    str_or_none,
     strip_jsonp,
     unified_timestamp,
     unsmuggle_url,
@ -306,3 +307,115 @ class BiliBiliBangumiIE(InfoExtractor):
         return self.playlist_result(
             entries, bangumi_id,
             season_info.get('bangumi_title'), season_info.get('evaluate'))
+
+
+class BilibiliAudioBaseIE(InfoExtractor):
+    def _call_api(self, path, sid, query=None):
+        if not query:
+            query = {'sid': sid}
+        return self._download_json(
+            'https://www.bilibili.com/audio/music-service-c/web/' + path,
+            sid, query=query)['data']
+
+
+class BilibiliAudioIE(BilibiliAudioBaseIE):
+    _VALID_URL = r'https?://(?:www\.)?bilibili\.com/audio/au(?P<id>\d+)'
+    _TEST = {
+        'url': 'https://www.bilibili.com/audio/au1003142',
+        'md5': 'fec4987014ec94ef9e666d4d158ad03b',
+        'info_dict': {
+            'id': '1003142',
+            'ext': 'm4a',
+            'title': '【tsukimi】YELLOW / 神山羊',
+            'artist': 'tsukimi',
+            'comment_count': int,
+            'description': 'YELLOW的mp3版!',
+            'duration': 183,
+            'subtitles': {
+                'origin': [{
+                    'ext': 'lrc',
+                }],
+            },
+            'thumbnail': r're:^https?://.+\.jpg',
+            'timestamp': 1564836614,
+            'upload_date': '20190803',
+            'uploader': 'tsukimi-つきみぐー',
+            'view_count': int,
+        },
+    }
+
+    def _real_extract(self, url):
+        au_id = self._match_id(url)
+
+        play_data = self._call_api('url', au_id)
+        formats = [{
+            'url': play_data['cdns'][0],
+            'filesize': int_or_none(play_data.get('size')),
+        }]
+
+        song = self._call_api('song/info', au_id)
+        title = song['title']
+        statistic = song.get('statistic') or {}
+
+        subtitles = None
+        lyric = song.get('lyric')
+        if lyric:
+            subtitles = {
+                'origin': [{
+                    'url': lyric,
+                }]
+            }
+
+        return {
+            'id': au_id,
+            'title': title,
+            'formats': formats,
+            'artist': song.get('author'),
+            'comment_count': int_or_none(statistic.get('comment')),
+            'description': song.get('intro'),
+            'duration': int_or_none(song.get('duration')),
+            'subtitles': subtitles,
+            'thumbnail': song.get('cover'),
+            'timestamp': int_or_none(song.get('passtime')),
+            'uploader': song.get('uname'),
+            'view_count': int_or_none(statistic.get('play')),
+        }
+
+
+class BilibiliAudioAlbumIE(BilibiliAudioBaseIE):
+    _VALID_URL = r'https?://(?:www\.)?bilibili\.com/audio/am(?P<id>\d+)'
+    _TEST = {
+        'url': 'https://www.bilibili.com/audio/am10624',
+        'info_dict': {
+            'id': '10624',
+            'title': '每日新曲推荐(每日11:00更新)',
+            'description': '每天11:00更新,为你推送最新音乐',
+        },
+        'playlist_count': 19,
+    }
+
+    def _real_extract(self, url):
+        am_id = self._match_id(url)
+
+        songs = self._call_api(
+            'song/of-menu', am_id, {'sid': am_id, 'pn': 1, 'ps': 100})['data']
+
+        entries = []
+        for song in songs:
+            sid = str_or_none(song.get('id'))
+            if not sid:
+                continue
+            entries.append(self.url_result(
+                'https://www.bilibili.com/audio/au' + sid,
+                BilibiliAudioIE.ie_key(), sid))
+
+        if entries:
+            album_data = self._call_api('menu/info', am_id) or {}
+            album_title = album_data.get('title')
+            if album_title:
+                for entry in entries:
+                    entry['album'] = album_title
+                return self.playlist_result(
+                    entries, am_id, album_title, album_data.get('intro'))
+
+        return self.playlist_result(entries, am_id)
@ -11,8 +11,8 @ from ..utils import ExtractorError
 class BokeCCBaseIE(InfoExtractor):
     def _extract_bokecc_formats(self, webpage, video_id, format_id=None):
         player_params_str = self._html_search_regex(
-            r'<(?:script|embed)[^>]+src="http://p\.bokecc\.com/player\?([^"]+)',
-            webpage, 'player params')
+            r'<(?:script|embed)[^>]+src=(?P<q>["\'])(?:https?:)?//p\.bokecc\.com/(?:player|flash/player\.swf)\?(?P<query>.+?)(?P=q)',
+            webpage, 'player params', group='query')

         player_params = compat_parse_qs(player_params_str)

@ -36,9 +36,9 @ class BokeCCIE(BokeCCBaseIE):
     _VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'

     _TESTS = [{
-        'url': 'http://union.bokecc.com/playvideo.bo?vid=E44D40C15E65EA30&uid=CD0C5D3C8614B28B',
+        'url': 'http://union.bokecc.com/playvideo.bo?vid=E0ABAE9D4F509B189C33DC5901307461&uid=FE644790DE9D154A',
         'info_dict': {
-            'id': 'CD0C5D3C8614B28B_E44D40C15E65EA30',
+            'id': 'FE644790DE9D154A_E0ABAE9D4F509B189C33DC5901307461',
             'ext': 'flv',
             'title': 'BokeCC Video',
         },
@ -3,7 +3,12 @ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import parse_duration
+from ..utils import (
+    determine_ext,
+    merge_dicts,
+    parse_duration,
+    url_or_none,
+)


 class BYUtvIE(InfoExtractor):
@ -51,7 +56,7 @ class BYUtvIE(InfoExtractor):
         video_id = mobj.group('id')
         display_id = mobj.group('display_id') or video_id

-        info = self._download_json(
+        video = self._download_json(
             'https://api.byutv.org/api3/catalog/getvideosforcontent',
             display_id, query={
                 'contentid': video_id,
@ -62,7 +67,7 @ class BYUtvIE(InfoExtractor):
                 'x-byutv-platformkey': 'xsaaw9c7y5',
             })

-        ep = info.get('ooyalaVOD')
+        ep = video.get('ooyalaVOD')
         if ep:
             return {
                 '_type': 'url_transparent',
@ -75,18 +80,38 @ class BYUtvIE(InfoExtractor):
                 'thumbnail': ep.get('imageThumbnail'),
             }

-        ep = info['dvr']
-        title = ep['title']
-        formats = self._extract_m3u8_formats(
-            ep['videoUrl'], video_id, 'mp4', entry_protocol='m3u8_native',
-            m3u8_id='hls')
+        info = {}
+        formats = []
+        for format_id, ep in video.items():
+            if not isinstance(ep, dict):
+                continue
+            video_url = url_or_none(ep.get('videoUrl'))
+            if not video_url:
+                continue
+            ext = determine_ext(video_url)
+            if ext == 'm3u8':
+                formats.extend(self._extract_m3u8_formats(
+                    video_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                    m3u8_id='hls', fatal=False))
+            elif ext == 'mpd':
+                formats.extend(self._extract_mpd_formats(
+                    video_url, video_id, mpd_id='dash', fatal=False))
+            else:
+                formats.append({
+                    'url': video_url,
+                    'format_id': format_id,
+                })
+            merge_dicts(info, {
+                'title': ep.get('title'),
+                'description': ep.get('description'),
+                'thumbnail': ep.get('imageThumbnail'),
+                'duration': parse_duration(ep.get('length')),
+            })
         self._sort_formats(formats)
-        return {
+
+        return merge_dicts(info, {
             'id': video_id,
             'display_id': display_id,
-            'title': title,
-            'description': ep.get('description'),
-            'thumbnail': ep.get('imageThumbnail'),
-            'duration': parse_duration(ep.get('length')),
+            'title': display_id,
             'formats': formats,
-        }
+        })
@ -147,6 +147,8 @ class CeskaTelevizeIE(InfoExtractor):
                 is_live = item.get('type') == 'LIVE'
                 formats = []
                 for format_id, stream_url in item.get('streamUrls', {}).items():
+                    if 'drmOnly=true' in stream_url:
+                        continue
                     if 'playerType=flash' in stream_url:
                         stream_formats = self._extract_m3u8_formats(
                             stream_url, playlist_id, 'mp4', 'm3u8_native',
@ -7,7 +7,7 @ from ..utils import ExtractorError


 class ChaturbateIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:[^/]+\.)?chaturbate\.com/(?P<id>[^/?#]+)'
+    _VALID_URL = r'https?://(?:[^/]+\.)?chaturbate\.com/(?:fullvideo/?\?.*?\bb=)?(?P<id>[^/?&#]+)'
     _TESTS = [{
         'url': 'https://www.chaturbate.com/siswet19/',
         'info_dict': {
@ -21,6 +21,9 @ class ChaturbateIE(InfoExtractor):
             'skip_download': True,
         },
         'skip': 'Room is offline',
+    }, {
+        'url': 'https://chaturbate.com/fullvideo/?b=caylin',
+        'only_matching': True,
     }, {
         'url': 'https://en.chaturbate.com/siswet19/',
         'only_matching': True,
@ -32,7 +35,8 @ class ChaturbateIE(InfoExtractor):
         video_id = self._match_id(url)

         webpage = self._download_webpage(
-            url, video_id, headers=self.geo_verification_headers())
+            'https://chaturbate.com/%s/' % video_id, video_id,
+            headers=self.geo_verification_headers())

         m3u8_urls = []

@ -1,74 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_duration,
|
||||
parse_iso8601,
|
||||
)
|
||||
|
||||
|
||||
class ComCarCoffIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?comediansincarsgettingcoffee\.com/(?P<id>[a-z0-9\-]*)'
|
||||
_TESTS = [{
|
||||
'url': 'http://comediansincarsgettingcoffee.com/miranda-sings-happy-thanksgiving-miranda/',
|
||||
'info_dict': {
|
||||
'id': '2494164',
|
||||
'ext': 'mp4',
|
||||
'upload_date': '20141127',
|
||||
'timestamp': 1417107600,
|
||||
'duration': 1232,
|
||||
'title': 'Happy Thanksgiving Miranda',
|
||||
'description': 'Jerry Seinfeld and his special guest Miranda Sings cruise around town in search of coffee, complaining and apologizing along the way.',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': 'requires ffmpeg',
|
||||
}
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
if not display_id:
|
||||
display_id = 'comediansincarsgettingcoffee.com'
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
full_data = self._parse_json(
|
||||
self._search_regex(
|
||||
r'window\.app\s*=\s*({.+?});\n', webpage, 'full data json'),
|
||||
display_id)['videoData']
|
||||
|
||||
display_id = full_data['activeVideo']['video']
|
||||
video_data = full_data.get('videos', {}).get(display_id) or full_data['singleshots'][display_id]
|
||||
|
||||
video_id = compat_str(video_data['mediaId'])
|
||||
title = video_data['title']
|
||||
formats = self._extract_m3u8_formats(
|
||||
video_data['mediaUrl'], video_id, 'mp4')
|
||||
self._sort_formats(formats)
|
||||
|
||||
thumbnails = [{
|
||||
'url': video_data['images']['thumb'],
|
||||
}, {
|
||||
'url': video_data['images']['poster'],
|
||||
}]
|
||||
|
||||
timestamp = int_or_none(video_data.get('pubDateTime')) or parse_iso8601(
|
||||
video_data.get('pubDate'))
|
||||
duration = int_or_none(video_data.get('durationSeconds')) or parse_duration(
|
||||
video_data.get('duration'))
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'display_id': display_id,
|
||||
'title': title,
|
||||
'description': video_data.get('description'),
|
||||
'timestamp': timestamp,
|
||||
'duration': duration,
|
||||
'thumbnails': thumbnails,
|
||||
'formats': formats,
|
||||
'season_number': int_or_none(video_data.get('season')),
|
||||
'episode_number': int_or_none(video_data.get('episode')),
|
||||
'webpage_url': 'http://comediansincarsgettingcoffee.com/%s' % (video_data.get('urlSlug', video_data.get('slug'))),
|
||||
}
|
@ -1424,12 +1424,10 @ class InfoExtractor(object):
         try:
             self._request_webpage(url, video_id, 'Checking %s URL' % item, headers=headers)
             return True
-        except ExtractorError as e:
-            if isinstance(e.cause, compat_urllib_error.URLError):
-                self.to_screen(
-                    '%s: %s URL is invalid, skipping' % (video_id, item))
-                return False
-            raise
+        except ExtractorError:
+            self.to_screen(
+                '%s: %s URL is invalid, skipping' % (video_id, item))
+            return False

     def http_scheme(self):
         """ Either "http:" or "https:", depending on the user's preferences """
@ -1457,14 +1455,14 @@ class InfoExtractor(object):
|
||||
|
||||
def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=None,
|
||||
transform_source=lambda s: fix_xml_ampersands(s).strip(),
|
||||
fatal=True, m3u8_id=None):
|
||||
fatal=True, m3u8_id=None, data=None, headers={}, query={}):
|
||||
manifest = self._download_xml(
|
||||
manifest_url, video_id, 'Downloading f4m manifest',
|
||||
'Unable to download f4m manifest',
|
||||
# Some manifests may be malformed, e.g. prosiebensat1 generated manifests
|
||||
# (see https://github.com/ytdl-org/youtube-dl/issues/6215#issuecomment-121704244)
|
||||
transform_source=transform_source,
|
||||
fatal=fatal)
|
||||
fatal=fatal, data=data, headers=headers, query=query)
|
||||
|
||||
if manifest is False:
|
||||
return []
|
||||
@ -1588,12 +1586,13 @@ class InfoExtractor(object):
|
||||
def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
|
||||
entry_protocol='m3u8', preference=None,
|
||||
m3u8_id=None, note=None, errnote=None,
|
||||
fatal=True, live=False):
|
||||
fatal=True, live=False, data=None, headers={},
|
||||
query={}):
|
||||
res = self._download_webpage_handle(
|
||||
m3u8_url, video_id,
|
||||
note=note or 'Downloading m3u8 information',
|
||||
errnote=errnote or 'Failed to download m3u8 information',
|
||||
fatal=fatal)
|
||||
fatal=fatal, data=data, headers=headers, query=query)
|
||||
|
||||
if res is False:
|
||||
return []
|
||||
@ -2011,12 +2010,12 @@ class InfoExtractor(object):
|
||||
})
|
||||
return entries
|
||||
|
||||
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}):
|
||||
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}, data=None, headers={}, query={}):
|
||||
res = self._download_xml_handle(
|
||||
mpd_url, video_id,
|
||||
note=note or 'Downloading MPD manifest',
|
||||
errnote=errnote or 'Failed to download MPD manifest',
|
||||
fatal=fatal)
|
||||
fatal=fatal, data=data, headers=headers, query=query)
|
||||
if res is False:
|
||||
return []
|
||||
mpd_doc, urlh = res
|
||||
@ -2319,12 +2318,12 @@ class InfoExtractor(object):
|
||||
self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
|
||||
return formats
|
||||
|
||||
def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True):
|
||||
def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True, data=None, headers={}, query={}):
|
||||
res = self._download_xml_handle(
|
||||
ism_url, video_id,
|
||||
note=note or 'Downloading ISM manifest',
|
||||
errnote=errnote or 'Failed to download ISM manifest',
|
||||
fatal=fatal)
|
||||
fatal=fatal, data=data, headers=headers, query=query)
|
||||
if res is False:
|
||||
return []
|
||||
ism_doc, urlh = res
|
||||
@ -2691,7 +2690,7 @@ class InfoExtractor(object):
|
||||
entry = {
|
||||
'id': this_video_id,
|
||||
'title': unescapeHTML(video_data['title'] if require_title else video_data.get('title')),
|
||||
'description': video_data.get('description'),
|
||||
'description': clean_html(video_data.get('description')),
|
||||
'thumbnail': urljoin(base_url, self._proto_relative_url(video_data.get('image'))),
|
||||
'timestamp': int_or_none(video_data.get('pubdate')),
|
||||
'duration': float_or_none(jwplayer_data.get('duration') or video_data.get('duration')),
|
||||
|
118
youtube_dl/extractor/contv.py
Normal file
@ -0,0 +1,118 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
)
|
||||
|
||||
|
||||
class CONtvIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?contv\.com/details-movie/(?P<id>[^/]+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.contv.com/details-movie/CEG10022949/days-of-thrills-&-laughter',
|
||||
'info_dict': {
|
||||
'id': 'CEG10022949',
|
||||
'ext': 'mp4',
|
||||
'title': 'Days Of Thrills & Laughter',
|
||||
'description': 'md5:5d6b3d0b1829bb93eb72898c734802eb',
|
||||
'upload_date': '20180703',
|
||||
'timestamp': 1530634789.61,
|
||||
},
|
||||
'params': {
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'https://www.contv.com/details-movie/CLIP-show_fotld_bts/fight-of-the-living-dead:-behind-the-scenes-bites',
|
||||
'info_dict': {
|
||||
'id': 'CLIP-show_fotld_bts',
|
||||
'title': 'Fight of the Living Dead: Behind the Scenes Bites',
|
||||
},
|
||||
'playlist_mincount': 7,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
details = self._download_json(
|
||||
'http://metax.contv.live.junctiontv.net/metax/2.5/details/' + video_id,
|
||||
video_id, query={'device': 'web'})
|
||||
|
||||
if details.get('type') == 'episodic':
|
||||
seasons = self._download_json(
|
||||
'http://metax.contv.live.junctiontv.net/metax/2.5/seriesfeed/json/' + video_id,
|
||||
video_id)
|
||||
entries = []
|
||||
for season in seasons:
|
||||
for episode in season.get('episodes', []):
|
||||
episode_id = episode.get('id')
|
||||
if not episode_id:
|
||||
continue
|
||||
entries.append(self.url_result(
|
||||
'https://www.contv.com/details-movie/' + episode_id,
|
||||
CONtvIE.ie_key(), episode_id))
|
||||
return self.playlist_result(entries, video_id, details.get('title'))
|
||||
|
||||
m_details = details['details']
|
||||
title = details['title']
|
||||
|
||||
formats = []
|
||||
|
||||
media_hls_url = m_details.get('media_hls_url')
|
||||
if media_hls_url:
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
media_hls_url, video_id, 'mp4',
|
||||
m3u8_id='hls', fatal=False))
|
||||
|
||||
media_mp4_url = m_details.get('media_mp4_url')
|
||||
if media_mp4_url:
|
||||
formats.append({
|
||||
'format_id': 'http',
|
||||
'url': media_mp4_url,
|
||||
})
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
subtitles = {}
|
||||
captions = m_details.get('captions') or {}
|
||||
for caption_url in captions.values():
|
||||
subtitles.setdefault('en', []).append({
|
||||
'url': caption_url
|
||||
})
|
||||
|
||||
thumbnails = []
|
||||
for image in m_details.get('images', []):
|
||||
image_url = image.get('url')
|
||||
if not image_url:
|
||||
continue
|
||||
thumbnails.append({
|
||||
'url': image_url,
|
||||
'width': int_or_none(image.get('width')),
|
||||
'height': int_or_none(image.get('height')),
|
||||
})
|
||||
|
||||
description = None
|
||||
for p in ('large_', 'medium_', 'small_', ''):
|
||||
d = m_details.get(p + 'description')
|
||||
if d:
|
||||
description = d
|
||||
break
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'formats': formats,
|
||||
'thumbnails': thumbnails,
|
||||
'description': description,
|
||||
'timestamp': float_or_none(details.get('metax_added_on'), 1000),
|
||||
'subtitles': subtitles,
|
||||
'duration': float_or_none(m_details.get('duration'), 1000),
|
||||
'view_count': int_or_none(details.get('num_watched')),
|
||||
'like_count': int_or_none(details.get('num_fav')),
|
||||
'categories': details.get('category'),
|
||||
'tags': details.get('tags'),
|
||||
'season_number': int_or_none(details.get('season')),
|
||||
'episode_number': int_or_none(details.get('episode')),
|
||||
'release_year': int_or_none(details.get('pub_year')),
|
||||
}
|
@ -1,154 +0,0 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
import json
|
||||
import random
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..aes import (
|
||||
aes_cbc_decrypt,
|
||||
aes_cbc_encrypt,
|
||||
)
|
||||
from ..compat import compat_b64decode
|
||||
from ..utils import (
|
||||
bytes_to_intlist,
|
||||
bytes_to_long,
|
||||
extract_attributes,
|
||||
ExtractorError,
|
||||
intlist_to_bytes,
|
||||
js_to_json,
|
||||
int_or_none,
|
||||
long_to_bytes,
|
||||
pkcs1pad,
|
||||
)
|
||||
|
||||
|
||||
class DaisukiMottoIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://motto\.daisuki\.net/framewatch/embed/[^/]+/(?P<id>[0-9a-zA-Z]{3})'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://motto.daisuki.net/framewatch/embed/embedDRAGONBALLSUPERUniverseSurvivalsaga/V2e/760/428',
|
||||
'info_dict': {
|
||||
'id': 'V2e',
|
||||
'ext': 'mp4',
|
||||
'title': '#117 SHOWDOWN OF LOVE! ANDROIDS VS UNIVERSE 2!!',
|
||||
'subtitles': {
|
||||
'mul': [{
|
||||
'ext': 'ttml',
|
||||
}],
|
||||
},
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True, # AES-encrypted HLS stream
|
||||
},
|
||||
}
|
||||
|
||||
# The public key in PEM format can be found in clientlibs_anime_watch.min.js
|
||||
_RSA_KEY = (0xc5524c25e8e14b366b3754940beeb6f96cb7e2feef0b932c7659a0c5c3bf173d602464c2df73d693b513ae06ff1be8f367529ab30bf969c5640522181f2a0c51ea546ae120d3d8d908595e4eff765b389cde080a1ef7f1bbfb07411cc568db73b7f521cedf270cbfbe0ddbc29b1ac9d0f2d8f4359098caffee6d07915020077d, 65537)
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
flashvars = self._parse_json(self._search_regex(
|
||||
r'(?s)var\s+flashvars\s*=\s*({.+?});', webpage, 'flashvars'),
|
||||
video_id, transform_source=js_to_json)
|
||||
|
||||
iv = [0] * 16
|
||||
|
||||
data = {}
|
||||
for key in ('device_cd', 'mv_id', 'ss1_prm', 'ss2_prm', 'ss3_prm', 'ss_id'):
|
||||
data[key] = flashvars.get(key, '')
|
||||
|
||||
encrypted_rtn = None
|
||||
|
||||
# Some AES keys are rejected. Try it with different AES keys
|
||||
for idx in range(5):
|
||||
aes_key = [random.randint(0, 254) for _ in range(32)]
|
||||
padded_aeskey = intlist_to_bytes(pkcs1pad(aes_key, 128))
|
||||
|
||||
n, e = self._RSA_KEY
|
||||
encrypted_aeskey = long_to_bytes(pow(bytes_to_long(padded_aeskey), e, n))
|
||||
init_data = self._download_json(
|
||||
'http://motto.daisuki.net/fastAPI/bgn/init/',
|
||||
video_id, query={
|
||||
's': flashvars.get('s', ''),
|
||||
'c': flashvars.get('ss3_prm', ''),
|
||||
'e': url,
|
||||
'd': base64.b64encode(intlist_to_bytes(aes_cbc_encrypt(
|
||||
bytes_to_intlist(json.dumps(data)),
|
||||
aes_key, iv))).decode('ascii'),
|
||||
'a': base64.b64encode(encrypted_aeskey).decode('ascii'),
|
||||
}, note='Downloading JSON metadata' + (' (try #%d)' % (idx + 1) if idx > 0 else ''))
|
||||
|
||||
if 'rtn' in init_data:
|
||||
encrypted_rtn = init_data['rtn']
|
||||
break
|
||||
|
||||
self._sleep(5, video_id)
|
||||
|
||||
if encrypted_rtn is None:
|
||||
raise ExtractorError('Failed to fetch init data')
|
||||
|
||||
rtn = self._parse_json(
|
||||
intlist_to_bytes(aes_cbc_decrypt(bytes_to_intlist(
|
||||
compat_b64decode(encrypted_rtn)),
|
||||
aes_key, iv)).decode('utf-8').rstrip('\0'),
|
||||
video_id)
|
||||
|
||||
title = rtn['title_str']
|
||||
|
||||
formats = self._extract_m3u8_formats(
|
||||
rtn['play_url'], video_id, ext='mp4', entry_protocol='m3u8_native')
|
||||
|
||||
subtitles = {}
|
||||
caption_url = rtn.get('caption_url')
|
||||
if caption_url:
|
||||
# mul: multiple languages
|
||||
subtitles['mul'] = [{
|
||||
'url': caption_url,
|
||||
'ext': 'ttml',
|
||||
}]
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
}
|
||||
|
||||
|
||||
class DaisukiMottoPlaylistIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://motto\.daisuki\.net/(?P<id>information)/'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://motto.daisuki.net/information/',
|
||||
'info_dict': {
|
||||
'title': 'DRAGON BALL SUPER',
|
||||
},
|
||||
'playlist_mincount': 117,
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
playlist_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, playlist_id)
|
||||
|
||||
entries = []
|
||||
for li in re.findall(r'(<li[^>]+?data-product_id="[a-zA-Z0-9]{3}"[^>]+>)', webpage):
|
||||
attr = extract_attributes(li)
|
||||
ad_id = attr.get('data-ad_id')
|
||||
product_id = attr.get('data-product_id')
|
||||
if ad_id and product_id:
|
||||
episode_id = attr.get('data-chapter')
|
||||
entries.append({
|
||||
'_type': 'url_transparent',
|
||||
'url': 'http://motto.daisuki.net/framewatch/embed/%s/%s/760/428' % (ad_id, product_id),
|
||||
'episode_id': episode_id,
|
||||
'episode_number': int_or_none(episode_id),
|
||||
'ie_key': 'DaisukiMotto',
|
||||
})
|
||||
|
||||
return self.playlist_result(entries, playlist_title='DRAGON BALL SUPER')
|
@ -2,25 +2,21 @@
|
||||
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
import itertools
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_urllib_parse_unquote,
|
||||
compat_urllib_parse_urlencode,
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
str_to_int,
|
||||
xpath_text,
|
||||
unescapeHTML,
|
||||
)
|
||||
|
||||
|
||||
class DaumIE(InfoExtractor):
|
||||
class DaumBaseIE(InfoExtractor):
|
||||
_KAKAO_EMBED_BASE = 'http://tv.kakao.com/embed/player/cliplink/'
|
||||
|
||||
|
||||
class DaumIE(DaumBaseIE):
|
||||
_VALID_URL = r'https?://(?:(?:m\.)?tvpot\.daum\.net/v/|videofarm\.daum\.net/controller/player/VodPlayer\.swf\?vid=)(?P<id>[^?#&]+)'
|
||||
IE_NAME = 'daum.net'
|
||||
|
||||
@ -36,6 +32,9 @@ class DaumIE(InfoExtractor):
|
||||
'duration': 2117,
|
||||
'view_count': int,
|
||||
'comment_count': int,
|
||||
'uploader_id': 186139,
|
||||
'uploader': '콘간지',
|
||||
'timestamp': 1387310323,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://m.tvpot.daum.net/v/65139429',
|
||||
@ -44,11 +43,14 @@ class DaumIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': '1297회, \'아빠 아들로 태어나길 잘 했어\' 민수, 감동의 눈물[아빠 어디가] 20150118',
|
||||
'description': 'md5:79794514261164ff27e36a21ad229fc5',
|
||||
'upload_date': '20150604',
|
||||
'upload_date': '20150118',
|
||||
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
|
||||
'duration': 154,
|
||||
'view_count': int,
|
||||
'comment_count': int,
|
||||
'uploader': 'MBC 예능',
|
||||
'uploader_id': 132251,
|
||||
'timestamp': 1421604228,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://tvpot.daum.net/v/07dXWRka62Y%24',
|
||||
@ -59,12 +61,15 @@ class DaumIE(InfoExtractor):
|
||||
'id': 'vwIpVpCQsT8$',
|
||||
'ext': 'flv',
|
||||
'title': '01-Korean War ( Trouble on the horizon )',
|
||||
'description': '\nKorean War 01\nTrouble on the horizon\n전쟁의 먹구름',
|
||||
'description': 'Korean War 01\r\nTrouble on the horizon\r\n전쟁의 먹구름',
|
||||
'upload_date': '20080223',
|
||||
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
|
||||
'duration': 249,
|
||||
'view_count': int,
|
||||
'comment_count': int,
|
||||
'uploader': '까칠한 墮落始祖 황비홍님의',
|
||||
'uploader_id': 560824,
|
||||
'timestamp': 1203770745,
|
||||
},
|
||||
}, {
|
||||
# Requires dte_type=WEB (#9972)
|
||||
@ -73,60 +78,24 @@ class DaumIE(InfoExtractor):
|
||||
'info_dict': {
|
||||
'id': 's3794Uf1NZeZ1qMpGpeqeRU',
|
||||
'ext': 'mp4',
|
||||
'title': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny) [쇼! 음악중심] 508회 20160611',
|
||||
'description': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)\n\n[쇼! 음악중심] 20160611, 507회',
|
||||
'upload_date': '20160611',
|
||||
'title': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)',
|
||||
'description': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)\r\n\r\n[쇼! 음악중심] 20160611, 507회',
|
||||
'upload_date': '20170129',
|
||||
'uploader': '쇼! 음악중심',
|
||||
'uploader_id': 2653210,
|
||||
'timestamp': 1485684628,
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = compat_urllib_parse_unquote(self._match_id(url))
|
||||
movie_data = self._download_json(
|
||||
'http://videofarm.daum.net/controller/api/closed/v1_2/IntegratedMovieData.json',
|
||||
video_id, 'Downloading video formats info', query={'vid': video_id, 'dte_type': 'WEB'})
|
||||
|
||||
# For urls like http://m.tvpot.daum.net/v/65139429, where the video_id is really a clipid
|
||||
if not movie_data.get('output_list', {}).get('output_list') and re.match(r'^\d+$', video_id):
|
||||
return self.url_result('http://tvpot.daum.net/clip/ClipView.do?clipid=%s' % video_id)
|
||||
|
||||
info = self._download_xml(
|
||||
'http://tvpot.daum.net/clip/ClipInfoXml.do', video_id,
|
||||
'Downloading video info', query={'vid': video_id})
|
||||
|
||||
formats = []
|
||||
for format_el in movie_data['output_list']['output_list']:
|
||||
profile = format_el['profile']
|
||||
format_query = compat_urllib_parse_urlencode({
|
||||
'vid': video_id,
|
||||
'profile': profile,
|
||||
})
|
||||
url_doc = self._download_xml(
|
||||
'http://videofarm.daum.net/controller/api/open/v1_2/MovieLocation.apixml?' + format_query,
|
||||
video_id, note='Downloading video data for %s format' % profile)
|
||||
format_url = url_doc.find('result/url').text
|
||||
formats.append({
|
||||
'url': format_url,
|
||||
'format_id': profile,
|
||||
'width': int_or_none(format_el.get('width')),
|
||||
'height': int_or_none(format_el.get('height')),
|
||||
'filesize': int_or_none(format_el.get('filesize')),
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': info.find('TITLE').text,
|
||||
'formats': formats,
|
||||
'thumbnail': xpath_text(info, 'THUMB_URL'),
|
||||
'description': xpath_text(info, 'CONTENTS'),
|
||||
'duration': int_or_none(xpath_text(info, 'DURATION')),
|
||||
'upload_date': info.find('REGDTTM').text[:8],
|
||||
'view_count': str_to_int(xpath_text(info, 'PLAY_CNT')),
|
||||
'comment_count': str_to_int(xpath_text(info, 'COMMENT_CNT')),
|
||||
}
|
||||
if not video_id.isdigit():
|
||||
video_id += '@my'
|
||||
return self.url_result(
|
||||
self._KAKAO_EMBED_BASE + video_id, 'Kakao', video_id)
|
||||
|
||||
|
||||
class DaumClipIE(InfoExtractor):
|
||||
class DaumClipIE(DaumBaseIE):
|
||||
_VALID_URL = r'https?://(?:m\.)?tvpot\.daum\.net/(?:clip/ClipView.(?:do|tv)|mypot/View.do)\?.*?clipid=(?P<id>\d+)'
|
||||
IE_NAME = 'daum.net:clip'
|
||||
_URL_TEMPLATE = 'http://tvpot.daum.net/clip/ClipView.do?clipid=%s'
|
||||
@ -142,6 +111,9 @@ class DaumClipIE(InfoExtractor):
|
||||
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
|
||||
'duration': 3868,
|
||||
'view_count': int,
|
||||
'uploader': 'GOMeXP',
|
||||
'uploader_id': 6667,
|
||||
'timestamp': 1377911092,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://m.tvpot.daum.net/clip/ClipView.tv?clipid=54999425',
|
||||
@ -154,22 +126,8 @@ class DaumClipIE(InfoExtractor):
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
clip_info = self._download_json(
|
||||
'http://tvpot.daum.net/mypot/json/GetClipInfo.do?clipid=%s' % video_id,
|
||||
video_id, 'Downloading clip info')['clip_bean']
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'id': video_id,
|
||||
'url': 'http://tvpot.daum.net/v/%s' % clip_info['vid'],
|
||||
'title': unescapeHTML(clip_info['title']),
|
||||
'thumbnail': clip_info.get('thumb_url'),
|
||||
'description': clip_info.get('contents'),
|
||||
'duration': int_or_none(clip_info.get('duration')),
|
||||
'upload_date': clip_info.get('up_date')[:8],
|
||||
'view_count': int_or_none(clip_info.get('play_count')),
|
||||
'ie_key': 'Daum',
|
||||
}
|
||||
return self.url_result(
|
||||
self._KAKAO_EMBED_BASE + video_id, 'Kakao', video_id)
|
||||
|
||||
|
||||
class DaumListIE(InfoExtractor):
|
||||
|
@ -3,63 +3,38 @@ from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .brightcove import BrightcoveLegacyIE
|
||||
from .dplay import DPlayIE
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import smuggle_url
|
||||
|
||||
|
||||
class DiscoveryNetworksDeIE(DPlayIE):
|
||||
_VALID_URL = r'''(?x)https?://(?:www\.)?(?P<site>discovery|tlc|animalplanet|dmax)\.de/
|
||||
(?:
|
||||
.*\#(?P<id>\d+)|
|
||||
(?:[^/]+/)*videos/(?P<display_id>[^/?#]+)|
|
||||
programme/(?P<programme>[^/]+)/video/(?P<alternate_id>[^/]+)
|
||||
)'''
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:tlc|dmax)\.de|dplay\.co\.uk)/(?:programme|show)/(?P<programme>[^/]+)/video/(?P<alternate_id>[^/]+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.tlc.de/sendungen/breaking-amish/videos/#3235167922001',
|
||||
'url': 'https://www.tlc.de/programme/breaking-amish/video/die-welt-da-drauen/DCB331270001100',
|
||||
'info_dict': {
|
||||
'id': '3235167922001',
|
||||
'id': '78867',
|
||||
'ext': 'mp4',
|
||||
'title': 'Breaking Amish: Die Welt da draußen',
|
||||
'description': (
|
||||
'Vier Amische und eine Mennonitin wagen in New York'
|
||||
' den Sprung in ein komplett anderes Leben. Begleitet sie auf'
|
||||
' ihrem spannenden Weg.'),
|
||||
'timestamp': 1396598084,
|
||||
'upload_date': '20140404',
|
||||
'uploader_id': '1659832546',
|
||||
'title': 'Die Welt da draußen',
|
||||
'description': 'md5:61033c12b73286e409d99a41742ef608',
|
||||
'timestamp': 1554069600,
|
||||
'upload_date': '20190331',
|
||||
},
|
||||
'params': {
|
||||
'format': 'bestvideo',
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.dmax.de/programme/storage-hunters-uk/videos/storage-hunters-uk-episode-6/',
|
||||
'url': 'https://www.dmax.de/programme/dmax-highlights/video/tuning-star-sidney-hoffmann-exklusiv-bei-dmax/191023082312316',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.discovery.de/#5332316765001',
|
||||
'url': 'https://www.dplay.co.uk/show/ghost-adventures/video/hotel-leger-103620/EHD_280313B',
|
||||
'only_matching': True,
|
||||
}]
|
||||
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1659832546/default_default/index.html?videoId=%s'
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
alternate_id = mobj.group('alternate_id')
|
||||
if alternate_id:
|
||||
self._initialize_geo_bypass({
|
||||
'countries': ['DE'],
|
||||
})
|
||||
return self._get_disco_api_info(
|
||||
url, '%s/%s' % (mobj.group('programme'), alternate_id),
|
||||
'sonic-eu1-prod.disco-api.com', mobj.group('site') + 'de')
|
||||
brightcove_id = mobj.group('id')
|
||||
if not brightcove_id:
|
||||
title = mobj.group('title')
|
||||
webpage = self._download_webpage(url, title)
|
||||
brightcove_legacy_url = BrightcoveLegacyIE._extract_brightcove_url(webpage)
|
||||
brightcove_id = compat_parse_qs(compat_urlparse.urlparse(
|
||||
brightcove_legacy_url).query)['@videoPlayer'][0]
|
||||
return self.url_result(smuggle_url(
|
||||
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, {'geo_countries': ['DE']}),
|
||||
'BrightcoveNew', brightcove_id)
|
||||
domain, programme, alternate_id = re.match(self._VALID_URL, url).groups()
|
||||
country = 'GB' if domain == 'dplay.co.uk' else 'DE'
|
||||
realm = 'questuk' if country == 'GB' else domain.replace('.', '')
|
||||
return self._get_disco_api_info(
|
||||
url, '%s/%s' % (programme, alternate_id),
|
||||
'sonic-eu1-prod.disco-api.com', realm, country)
|
||||
|
@ -1,74 +1,68 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import json
|
||||
import re
|
||||
import time
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_HTTPError,
|
||||
compat_str,
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..compat import compat_HTTPError
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
ExtractorError,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
remove_end,
|
||||
try_get,
|
||||
unified_strdate,
|
||||
unified_timestamp,
|
||||
update_url_query,
|
||||
urljoin,
|
||||
USER_AGENTS,
|
||||
)
|
||||
|
||||
|
||||
class DPlayIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?P<domain>www\.(?P<host>dplay\.(?P<country>dk|se|no)))/(?:video(?:er|s)/)?(?P<id>[^/]+/[^/?#]+)'
|
||||
_VALID_URL = r'''(?x)https?://
|
||||
(?P<domain>
|
||||
(?:www\.)?(?P<host>dplay\.(?P<country>dk|fi|jp|se|no))|
|
||||
(?P<subdomain_country>es|it)\.dplay\.com
|
||||
)/[^/]+/(?P<id>[^/]+/[^/?#]+)'''
|
||||
|
||||
_TESTS = [{
|
||||
# non geo restricted, via secure api, unsigned download hls URL
|
||||
'url': 'http://www.dplay.se/nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet/',
|
||||
'url': 'https://www.dplay.se/videos/nugammalt-77-handelser-som-format-sverige/nugammalt-77-handelser-som-format-sverige-101',
|
||||
'info_dict': {
|
||||
'id': '3172',
|
||||
'display_id': 'nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet',
|
||||
'id': '13628',
|
||||
'display_id': 'nugammalt-77-handelser-som-format-sverige/nugammalt-77-handelser-som-format-sverige-101',
|
||||
'ext': 'mp4',
|
||||
'title': 'Svensken lär sig njuta av livet',
|
||||
'description': 'md5:d3819c9bccffd0fe458ca42451dd50d8',
|
||||
'duration': 2650,
|
||||
'timestamp': 1365454320,
|
||||
'duration': 2649.856,
|
||||
'timestamp': 1365453720,
|
||||
'upload_date': '20130408',
|
||||
'creator': 'Kanal 5 (Home)',
|
||||
'creator': 'Kanal 5',
|
||||
'series': 'Nugammalt - 77 händelser som format Sverige',
|
||||
'season_number': 1,
|
||||
'episode_number': 1,
|
||||
'age_limit': 0,
|
||||
},
|
||||
'params': {
|
||||
'format': 'bestvideo',
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
# geo restricted, via secure api, unsigned download hls URL
|
||||
'url': 'http://www.dplay.dk/mig-og-min-mor/season-6-episode-12/',
|
||||
'url': 'http://www.dplay.dk/videoer/ted-bundy-mind-of-a-monster/ted-bundy-mind-of-a-monster',
|
||||
'info_dict': {
|
||||
'id': '70816',
|
||||
'display_id': 'mig-og-min-mor/season-6-episode-12',
|
||||
'id': '104465',
|
||||
'display_id': 'ted-bundy-mind-of-a-monster/ted-bundy-mind-of-a-monster',
|
||||
'ext': 'mp4',
|
||||
'title': 'Episode 12',
|
||||
'description': 'md5:9c86e51a93f8a4401fc9641ef9894c90',
|
||||
'duration': 2563,
|
||||
'timestamp': 1429696800,
|
||||
'upload_date': '20150422',
|
||||
'creator': 'Kanal 4 (Home)',
|
||||
'series': 'Mig og min mor',
|
||||
'season_number': 6,
|
||||
'episode_number': 12,
|
||||
'age_limit': 0,
|
||||
'title': 'Ted Bundy: Mind Of A Monster',
|
||||
'description': 'md5:8b780f6f18de4dae631668b8a9637995',
|
||||
'duration': 5290.027,
|
||||
'timestamp': 1570694400,
|
||||
'upload_date': '20191010',
|
||||
'creator': 'ID - Investigation Discovery',
|
||||
'series': 'Ted Bundy: Mind Of A Monster',
|
||||
'season_number': 1,
|
||||
'episode_number': 1,
|
||||
},
|
||||
'params': {
|
||||
'format': 'bestvideo',
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
# geo restricted, via direct unsigned hls URL
|
||||
'url': 'http://www.dplay.no/pga-tour/season-1-hoydepunkter-18-21-februar/',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# disco-api
|
||||
'url': 'https://www.dplay.no/videoer/i-kongens-klr/sesong-1-episode-7',
|
||||
@ -89,19 +83,59 @@ class DPlayIE(InfoExtractor):
|
||||
'format': 'bestvideo',
|
||||
'skip_download': True,
|
||||
},
|
||||
'skip': 'Available for Premium users',
|
||||
}, {
|
||||
|
||||
'url': 'https://www.dplay.dk/videoer/singleliv/season-5-episode-3',
|
||||
'url': 'http://it.dplay.com/nove/biografie-imbarazzanti/luigi-di-maio-la-psicosi-di-stanislawskij/',
|
||||
'md5': '2b808ffb00fc47b884a172ca5d13053c',
|
||||
'info_dict': {
|
||||
'id': '6918',
|
||||
'display_id': 'biografie-imbarazzanti/luigi-di-maio-la-psicosi-di-stanislawskij',
|
||||
'ext': 'mp4',
|
||||
'title': 'Luigi Di Maio: la psicosi di Stanislawskij',
|
||||
'description': 'md5:3c7a4303aef85868f867a26f5cc14813',
|
||||
'thumbnail': r're:^https?://.*\.jpe?g',
|
||||
'upload_date': '20160524',
|
||||
'timestamp': 1464076800,
|
||||
'series': 'Biografie imbarazzanti',
|
||||
'season_number': 1,
|
||||
'episode': 'Episode 1',
|
||||
'episode_number': 1,
|
||||
},
|
||||
}, {
|
||||
'url': 'https://es.dplay.com/dmax/la-fiebre-del-oro/temporada-8-episodio-1/',
|
||||
'info_dict': {
|
||||
'id': '21652',
|
||||
'display_id': 'la-fiebre-del-oro/temporada-8-episodio-1',
|
||||
'ext': 'mp4',
|
||||
'title': 'Episodio 1',
|
||||
'description': 'md5:b9dcff2071086e003737485210675f69',
|
||||
'thumbnail': r're:^https?://.*\.png',
|
||||
'upload_date': '20180709',
|
||||
'timestamp': 1531173540,
|
||||
'series': 'La fiebre del oro',
|
||||
'season_number': 8,
|
||||
'episode': 'Episode 1',
|
||||
'episode_number': 1,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'https://www.dplay.fi/videot/shifting-gears-with-aaron-kaufman/episode-16',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.dplay.se/videos/sofias-anglar/sofias-anglar-1001',
|
||||
'url': 'https://www.dplay.jp/video/gold-rush/24086',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _get_disco_api_info(self, url, display_id, disco_host, realm):
|
||||
disco_base = 'https://' + disco_host
|
||||
def _get_disco_api_info(self, url, display_id, disco_host, realm, country):
|
||||
geo_countries = [country.upper()]
|
||||
self._initialize_geo_bypass({
|
||||
'countries': geo_countries,
|
||||
})
|
||||
disco_base = 'https://%s/' % disco_host
|
||||
token = self._download_json(
|
||||
'%s/token' % disco_base, display_id, 'Downloading token',
|
||||
disco_base + 'token', display_id, 'Downloading token',
|
||||
query={
|
||||
'realm': realm,
|
||||
})['data']['attributes']['token']
|
||||
@ -110,17 +144,35 @@ class DPlayIE(InfoExtractor):
|
||||
'Authorization': 'Bearer ' + token,
|
||||
}
|
||||
video = self._download_json(
|
||||
'%s/content/videos/%s' % (disco_base, display_id), display_id,
|
||||
disco_base + 'content/videos/' + display_id, display_id,
|
||||
headers=headers, query={
|
||||
'include': 'show'
|
||||
'fields[channel]': 'name',
|
||||
'fields[image]': 'height,src,width',
|
||||
'fields[show]': 'name',
|
||||
'fields[tag]': 'name',
|
||||
'fields[video]': 'description,episodeNumber,name,publishStart,seasonNumber,videoDuration',
|
||||
'include': 'images,primaryChannel,show,tags'
|
||||
})
|
||||
video_id = video['data']['id']
|
||||
info = video['data']['attributes']
|
||||
title = info['name']
|
||||
title = info['name'].strip()
|
||||
formats = []
|
||||
for format_id, format_dict in self._download_json(
|
||||
'%s/playback/videoPlaybackInfo/%s' % (disco_base, video_id),
|
||||
display_id, headers=headers)['data']['attributes']['streaming'].items():
|
||||
try:
|
||||
streaming = self._download_json(
|
||||
disco_base + 'playback/videoPlaybackInfo/' + video_id,
|
||||
display_id, headers=headers)['data']['attributes']['streaming']
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
|
||||
info = self._parse_json(e.cause.read().decode('utf-8'), display_id)
|
||||
error = info['errors'][0]
|
||||
error_code = error.get('code')
|
||||
if error_code == 'access.denied.geoblocked':
|
||||
self.raise_geo_restricted(countries=geo_countries)
|
||||
elif error_code == 'access.denied.missingpackage':
|
||||
self.raise_login_required()
|
||||
raise ExtractorError(info['errors'][0]['detail'], expected=True)
|
||||
raise
|
||||
for format_id, format_dict in streaming.items():
|
||||
if not isinstance(format_dict, dict):
|
||||
continue
|
||||
format_url = format_dict.get('url')
|
||||
@ -142,235 +194,54 @@ class DPlayIE(InfoExtractor):
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
series = None
|
||||
try:
|
||||
included = video.get('included')
|
||||
if isinstance(included, list):
|
||||
show = next(e for e in included if e.get('type') == 'show')
|
||||
series = try_get(
|
||||
show, lambda x: x['attributes']['name'], compat_str)
|
||||
except StopIteration:
|
||||
pass
|
||||
creator = series = None
|
||||
tags = []
|
||||
thumbnails = []
|
||||
included = video.get('included') or []
|
||||
if isinstance(included, list):
|
||||
for e in included:
|
||||
attributes = e.get('attributes')
|
||||
if not attributes:
|
||||
continue
|
||||
e_type = e.get('type')
|
||||
if e_type == 'channel':
|
||||
creator = attributes.get('name')
|
||||
elif e_type == 'image':
|
||||
src = attributes.get('src')
|
||||
if src:
|
||||
thumbnails.append({
|
||||
'url': src,
|
||||
'width': int_or_none(attributes.get('width')),
|
||||
'height': int_or_none(attributes.get('height')),
|
||||
})
|
||||
if e_type == 'show':
|
||||
series = attributes.get('name')
|
||||
elif e_type == 'tag':
|
||||
name = attributes.get('name')
|
||||
if name:
|
||||
tags.append(name)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'display_id': display_id,
|
||||
'title': title,
|
||||
'description': info.get('description'),
|
||||
'duration': float_or_none(
|
||||
info.get('videoDuration'), scale=1000),
|
||||
'duration': float_or_none(info.get('videoDuration'), 1000),
|
||||
'timestamp': unified_timestamp(info.get('publishStart')),
|
||||
'series': series,
|
||||
'season_number': int_or_none(info.get('seasonNumber')),
|
||||
'episode_number': int_or_none(info.get('episodeNumber')),
|
||||
'age_limit': int_or_none(info.get('minimum_age')),
|
||||
'creator': creator,
|
||||
'tags': tags,
|
||||
'thumbnails': thumbnails,
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
display_id = mobj.group('id')
|
||||
domain = mobj.group('domain')
|
||||
|
||||
self._initialize_geo_bypass({
|
||||
'countries': [mobj.group('country').upper()],
|
||||
})
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
video_id = self._search_regex(
|
||||
r'data-video-id=["\'](\d+)', webpage, 'video id', default=None)
|
||||
|
||||
if not video_id:
|
||||
host = mobj.group('host')
|
||||
return self._get_disco_api_info(
|
||||
url, display_id, 'disco-api.' + host, host.replace('.', ''))
|
||||
|
||||
info = self._download_json(
|
||||
'http://%s/api/v2/ajax/videos?video_id=%s' % (domain, video_id),
|
||||
video_id)['data'][0]
|
||||
|
||||
title = info['title']
|
||||
|
||||
PROTOCOLS = ('hls', 'hds')
|
||||
formats = []
|
||||
|
||||
def extract_formats(protocol, manifest_url):
|
||||
if protocol == 'hls':
|
||||
m3u8_formats = self._extract_m3u8_formats(
|
||||
manifest_url, video_id, ext='mp4',
|
||||
entry_protocol='m3u8_native', m3u8_id=protocol, fatal=False)
|
||||
# Sometimes final URLs inside m3u8 are unsigned, let's fix this
|
||||
# ourselves. Also fragments' URLs are only served signed for
|
||||
# Safari user agent.
|
||||
query = compat_urlparse.parse_qs(compat_urlparse.urlparse(manifest_url).query)
|
||||
for m3u8_format in m3u8_formats:
|
||||
m3u8_format.update({
|
||||
'url': update_url_query(m3u8_format['url'], query),
|
||||
'http_headers': {
|
||||
'User-Agent': USER_AGENTS['Safari'],
|
||||
},
|
||||
})
|
||||
formats.extend(m3u8_formats)
|
||||
elif protocol == 'hds':
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
manifest_url + '&hdcore=3.8.0&plugin=flowplayer-3.8.0.0',
|
||||
video_id, f4m_id=protocol, fatal=False))
|
||||
|
||||
domain_tld = domain.split('.')[-1]
|
||||
if domain_tld in ('se', 'dk', 'no'):
|
||||
for protocol in PROTOCOLS:
|
||||
# Providing dsc-geo allows bypassing geo restriction in some cases
|
||||
self._set_cookie(
|
||||
'secure.dplay.%s' % domain_tld, 'dsc-geo',
|
||||
json.dumps({
|
||||
'countryCode': domain_tld.upper(),
|
||||
'expiry': (time.time() + 20 * 60) * 1000,
|
||||
}))
|
||||
stream = self._download_json(
|
||||
'https://secure.dplay.%s/secure/api/v2/user/authorization/stream/%s?stream_type=%s'
|
||||
% (domain_tld, video_id, protocol), video_id,
|
||||
'Downloading %s stream JSON' % protocol, fatal=False)
|
||||
if stream and stream.get(protocol):
|
||||
extract_formats(protocol, stream[protocol])
|
||||
|
||||
# The last resort is to try direct unsigned hls/hds URLs from info dictionary.
|
||||
# Sometimes this does work even when secure API with dsc-geo has failed (e.g.
|
||||
# http://www.dplay.no/pga-tour/season-1-hoydepunkter-18-21-februar/).
|
||||
if not formats:
|
||||
for protocol in PROTOCOLS:
|
||||
if info.get(protocol):
|
||||
extract_formats(protocol, info[protocol])
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
subtitles = {}
|
||||
for lang in ('se', 'sv', 'da', 'nl', 'no'):
|
||||
for format_id in ('web_vtt', 'vtt', 'srt'):
|
||||
subtitle_url = info.get('subtitles_%s_%s' % (lang, format_id))
|
||||
if subtitle_url:
|
||||
subtitles.setdefault(lang, []).append({'url': subtitle_url})
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'display_id': display_id,
|
||||
'title': title,
|
||||
'description': info.get('video_metadata_longDescription'),
|
||||
'duration': int_or_none(info.get('video_metadata_length'), scale=1000),
|
||||
'timestamp': int_or_none(info.get('video_publish_date')),
|
||||
'creator': info.get('video_metadata_homeChannel'),
|
||||
'series': info.get('video_metadata_show'),
|
||||
'season_number': int_or_none(info.get('season')),
|
||||
'episode_number': int_or_none(info.get('episode')),
|
||||
'age_limit': int_or_none(info.get('minimum_age')),
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
}
|
||||
|
||||
|
||||
class DPlayItIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://it\.dplay\.com/[^/]+/[^/]+/(?P<id>[^/?#]+)'
|
||||
_GEO_COUNTRIES = ['IT']
|
||||
_TEST = {
|
||||
'url': 'http://it.dplay.com/nove/biografie-imbarazzanti/luigi-di-maio-la-psicosi-di-stanislawskij/',
|
||||
'md5': '2b808ffb00fc47b884a172ca5d13053c',
|
||||
'info_dict': {
|
||||
'id': '6918',
|
||||
'display_id': 'luigi-di-maio-la-psicosi-di-stanislawskij',
|
||||
'ext': 'mp4',
|
||||
'title': 'Biografie imbarazzanti: Luigi Di Maio: la psicosi di Stanislawskij',
|
||||
'description': 'md5:3c7a4303aef85868f867a26f5cc14813',
|
||||
'thumbnail': r're:^https?://.*\.jpe?g',
|
||||
'upload_date': '20160524',
|
||||
'series': 'Biografie imbarazzanti',
|
||||
'season_number': 1,
|
||||
'episode': 'Luigi Di Maio: la psicosi di Stanislawskij',
|
||||
'episode_number': 1,
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
title = remove_end(self._og_search_title(webpage), ' | Dplay')
|
||||
|
||||
video_id = None
|
||||
|
||||
info = self._search_regex(
|
||||
r'playback_json\s*:\s*JSON\.parse\s*\(\s*("(?:\\.|[^"\\])+?")',
|
||||
webpage, 'playback JSON', default=None)
|
||||
if info:
|
||||
for _ in range(2):
|
||||
info = self._parse_json(info, display_id, fatal=False)
|
||||
if not info:
|
||||
break
|
||||
else:
|
||||
video_id = try_get(info, lambda x: x['data']['id'])
|
||||
|
||||
if not info:
|
||||
info_url = self._search_regex(
|
||||
(r'playback_json_url\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1',
|
||||
r'url\s*[:=]\s*["\'](?P<url>(?:https?:)?//[^/]+/playback/videoPlaybackInfo/\d+)'),
|
||||
webpage, 'info url', group='url')
|
||||
|
||||
info_url = urljoin(url, info_url)
|
||||
video_id = info_url.rpartition('/')[-1]
|
||||
|
||||
try:
|
||||
info = self._download_json(
|
||||
info_url, display_id, headers={
|
||||
'Authorization': 'Bearer %s' % self._get_cookies(url).get(
|
||||
'dplayit_token').value,
|
||||
'Referer': url,
|
||||
})
|
||||
if isinstance(info, compat_str):
|
||||
info = self._parse_json(info, display_id)
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (400, 403):
|
||||
info = self._parse_json(e.cause.read().decode('utf-8'), display_id)
|
||||
error = info['errors'][0]
|
||||
if error.get('code') == 'access.denied.geoblocked':
|
||||
self.raise_geo_restricted(
|
||||
msg=error.get('detail'), countries=self._GEO_COUNTRIES)
|
||||
raise ExtractorError(info['errors'][0]['detail'], expected=True)
|
||||
raise
|
||||
|
||||
hls_url = info['data']['attributes']['streaming']['hls']['url']
|
||||
|
||||
formats = self._extract_m3u8_formats(
|
||||
hls_url, display_id, ext='mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id='hls')
|
||||
self._sort_formats(formats)
|
||||
|
||||
series = self._html_search_regex(
|
||||
r'(?s)<h1[^>]+class=["\'].*?\bshow_title\b.*?["\'][^>]*>(.+?)</h1>',
|
||||
webpage, 'series', fatal=False)
|
||||
episode = self._search_regex(
|
||||
r'<p[^>]+class=["\'].*?\bdesc_ep\b.*?["\'][^>]*>\s*<br/>\s*<b>([^<]+)',
|
||||
webpage, 'episode', fatal=False)
|
||||
|
||||
mobj = re.search(
|
||||
r'(?s)<span[^>]+class=["\']dates["\'][^>]*>.+?\bS\.(?P<season_number>\d+)\s+E\.(?P<episode_number>\d+)\s*-\s*(?P<upload_date>\d{2}/\d{2}/\d{4})',
|
||||
webpage)
|
||||
if mobj:
|
||||
season_number = int(mobj.group('season_number'))
|
||||
episode_number = int(mobj.group('episode_number'))
|
||||
upload_date = unified_strdate(mobj.group('upload_date'))
|
||||
else:
|
||||
season_number = episode_number = upload_date = None
|
||||
|
||||
return {
|
||||
'id': compat_str(video_id or display_id),
|
||||
'display_id': display_id,
|
||||
'title': title,
|
||||
'description': self._og_search_description(webpage),
|
||||
'thumbnail': self._og_search_thumbnail(webpage),
|
||||
'series': series,
|
||||
'season_number': season_number,
|
||||
'episode': episode,
|
||||
'episode_number': episode_number,
|
||||
'upload_date': upload_date,
|
||||
'formats': formats,
|
||||
}
|
||||
domain = mobj.group('domain').lstrip('www.')
|
||||
country = mobj.group('country') or mobj.group('subdomain_country')
|
||||
host = 'disco-api.' + domain if domain.startswith('dplay.') else 'eu2-prod.disco-api.com'
|
||||
return self._get_disco_api_info(
|
||||
url, display_id, host, 'dplay' + country, country)
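The 403 handling added to `_get_disco_api_info` in this hunk picks a failure mode from the first entry of the API's `errors` list. Below is a minimal standalone sketch of that dispatch, assuming the error body has the shape `{"errors": [{"code": ..., "detail": ...}]}` that the hunk reads. `classify_disco_error` and the labels it returns are hypothetical names for illustration; the extractor itself raises (geo restricted, login required, or `ExtractorError`) instead of returning a value.

```python
import json


def classify_disco_error(body):
    """Map a Disco API 403 error body to a (kind, detail) pair.

    Hypothetical helper for illustration only: the extractor in the diff
    raises exceptions at these points instead of returning labels.
    """
    info = json.loads(body)
    error = info['errors'][0]
    code = error.get('code')
    if code == 'access.denied.geoblocked':
        return 'geo_restricted', error.get('detail')
    if code == 'access.denied.missingpackage':
        return 'login_required', error.get('detail')
    # Any other code falls through to a generic, expected error.
    return 'error', error.get('detail')
```

Only the two error codes named in the hunk are special-cased; every other code is reported through its `detail` text.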
|
||||
|
@ -17,6 +17,7 @@ from ..utils import (
|
||||
float_or_none,
|
||||
mimetype2ext,
|
||||
str_or_none,
|
||||
try_get,
|
||||
unified_timestamp,
|
||||
update_url_query,
|
||||
url_or_none,
|
||||
@ -24,7 +25,14 @@ from ..utils import (
|
||||
|
||||
|
||||
class DRTVIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?dr\.dk/(?:tv/se|nyheder|radio(?:/ondemand)?)/(?:[^/]+/)*(?P<id>[\da-z-]+)(?:[/#?]|$)'
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:
|
||||
(?:www\.)?dr\.dk/(?:tv/se|nyheder|radio(?:/ondemand)?)/(?:[^/]+/)*|
|
||||
(?:www\.)?(?:dr\.dk|dr-massive\.com)/drtv/(?:se|episode)/
|
||||
)
|
||||
(?P<id>[\da-z_-]+)
|
||||
'''
|
||||
_GEO_BYPASS = False
|
||||
_GEO_COUNTRIES = ['DK']
|
||||
IE_NAME = 'drtv'
|
||||
@ -83,6 +91,26 @@ class DRTVIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'https://www.dr.dk/radio/p4kbh/regionale-nyheder-kh4/p4-nyheder-2019-06-26-17-30-9',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.dr.dk/drtv/se/bonderoeven_71769',
|
||||
'info_dict': {
|
||||
'id': '00951930010',
|
||||
'ext': 'mp4',
|
||||
'title': 'Bonderøven (1:8)',
|
||||
'description': 'md5:3cf18fc0d3b205745d4505f896af8121',
|
||||
'timestamp': 1546542000,
|
||||
'upload_date': '20190103',
|
||||
'duration': 2576.6,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'https://www.dr.dk/drtv/episode/bonderoeven_71769',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://dr-massive.com/drtv/se/bonderoeven_71769',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
@ -100,13 +128,32 @@ class DRTVIE(InfoExtractor):
|
||||
webpage, 'video id', default=None)
|
||||
|
||||
if not video_id:
|
||||
video_id = compat_urllib_parse_unquote(self._search_regex(
|
||||
video_id = self._search_regex(
|
||||
r'(urn(?:%3A|:)dr(?:%3A|:)mu(?:%3A|:)programcard(?:%3A|:)[\da-f]+)',
|
||||
webpage, 'urn'))
|
||||
webpage, 'urn', default=None)
|
||||
if video_id:
|
||||
video_id = compat_urllib_parse_unquote(video_id)
|
||||
|
||||
_PROGRAMCARD_BASE = 'https://www.dr.dk/mu-online/api/1.4/programcard'
|
||||
query = {'expanded': 'true'}
|
||||
|
||||
if video_id:
|
||||
programcard_url = '%s/%s' % (_PROGRAMCARD_BASE, video_id)
|
||||
else:
|
||||
programcard_url = _PROGRAMCARD_BASE
|
||||
page = self._parse_json(
|
||||
self._search_regex(
|
||||
r'data\s*=\s*({.+?})\s*(?:;|</script)', webpage,
|
||||
'data'), '1')['cache']['page']
|
||||
page = page[list(page.keys())[0]]
|
||||
item = try_get(
|
||||
page, (lambda x: x['item'], lambda x: x['entries'][0]['item']),
|
||||
dict)
|
||||
video_id = item['customId'].split(':')[-1]
|
||||
query['productionnumber'] = video_id
|
||||
|
||||
data = self._download_json(
|
||||
'https://www.dr.dk/mu-online/api/1.4/programcard/%s' % video_id,
|
||||
video_id, 'Downloading video JSON', query={'expanded': 'true'})
|
||||
programcard_url, video_id, 'Downloading video JSON', query=query)
|
||||
|
||||
title = str_or_none(data.get('Title')) or re.sub(
|
||||
r'\s*\|\s*(?:TV\s*\|\s*DR|DRTV)$', '',
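The new verbose `_VALID_URL` for `DRTVIE` can be checked against the test URLs from this hunk with plain `re`. This is a sketch: the pattern is copied from the diff, while `match_id` is a hypothetical stand-in for the extractor's own ID matching.

```python
import re

# Pattern copied from the new _VALID_URL in the DRTVIE hunk.
VALID_URL = r'''(?x)
    https?://
    (?:
        (?:www\.)?dr\.dk/(?:tv/se|nyheder|radio(?:/ondemand)?)/(?:[^/]+/)*|
        (?:www\.)?(?:dr\.dk|dr-massive\.com)/drtv/(?:se|episode)/
    )
    (?P<id>[\da-z_-]+)
'''


def match_id(url):
    """Return the captured id, or None if the URL does not match."""
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None


for test_url in (
    'https://www.dr.dk/drtv/se/bonderoeven_71769',
    'https://www.dr.dk/drtv/episode/bonderoeven_71769',
    'https://dr-massive.com/drtv/se/bonderoeven_71769',
    'https://www.dr.dk/radio/p4kbh/regionale-nyheder-kh4/p4-nyheder-2019-06-26-17-30-9',
):
    print(test_url, '->', match_id(test_url))
```

The `_` added to the `id` character class is what lets the new `drtv/se/` URLs with IDs like `bonderoeven_71769` match in full.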
|
||||
|
@ -1,20 +1,17 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_b64decode
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
qualities,
|
||||
sanitized_Request,
|
||||
)
|
||||
|
||||
|
||||
class DumpertIE(InfoExtractor):
|
||||
_VALID_URL = r'(?P<protocol>https?)://(?:www\.)?dumpert\.nl/(?:mediabase|embed)/(?P<id>[0-9]+/[0-9a-zA-Z]+)'
|
||||
_VALID_URL = r'(?P<protocol>https?)://(?:(?:www|legacy)\.)?dumpert\.nl/(?:mediabase|embed|item)/(?P<id>[0-9]+[/_][0-9a-zA-Z]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.dumpert.nl/mediabase/6646981/951bc60f/',
|
||||
'url': 'https://www.dumpert.nl/item/6646981_951bc60f',
|
||||
'md5': '1b9318d7d5054e7dcb9dc7654f21d643',
|
||||
'info_dict': {
|
||||
'id': '6646981/951bc60f',
|
||||
@ -24,46 +21,60 @@ class DumpertIE(InfoExtractor):
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.dumpert.nl/embed/6675421/dc440fe7/',
|
||||
'url': 'https://www.dumpert.nl/embed/6675421_dc440fe7',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://legacy.dumpert.nl/mediabase/6646981/951bc60f',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://legacy.dumpert.nl/embed/6675421/dc440fe7',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('id')
|
||||
protocol = mobj.group('protocol')
|
||||
|
||||
url = '%s://www.dumpert.nl/mediabase/%s' % (protocol, video_id)
|
||||
req = sanitized_Request(url)
|
||||
req.add_header('Cookie', 'nsfw=1; cpc=10')
|
||||
webpage = self._download_webpage(req, video_id)
|
||||
|
||||
files_base64 = self._search_regex(
|
||||
r'data-files="([^"]+)"', webpage, 'data files')
|
||||
|
||||
files = self._parse_json(
|
||||
compat_b64decode(files_base64).decode('utf-8'),
|
||||
video_id)
|
||||
video_id = self._match_id(url).replace('_', '/')
|
||||
item = self._download_json(
|
||||
'http://api-live.dumpert.nl/mobile_api/json/info/' + video_id.replace('/', '_'),
|
||||
video_id)['items'][0]
|
||||
title = item['title']
|
||||
media = next(m for m in item['media'] if m.get('mediatype') == 'VIDEO')
|
||||
|
||||
quality = qualities(['flv', 'mobile', 'tablet', '720p'])
|
||||
|
||||
formats = [{
|
||||
'url': video_url,
|
||||
'format_id': format_id,
|
||||
'quality': quality(format_id),
|
||||
} for format_id, video_url in files.items() if format_id != 'still']
|
||||
formats = []
|
||||
for variant in media.get('variants', []):
|
||||
uri = variant.get('uri')
|
||||
if not uri:
|
||||
continue
|
||||
version = variant.get('version')
|
||||
formats.append({
|
||||
'url': uri,
|
||||
'format_id': version,
|
||||
'quality': quality(version),
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
title = self._html_search_meta(
|
||||
'title', webpage) or self._og_search_title(webpage)
|
||||
description = self._html_search_meta(
|
||||
'description', webpage) or self._og_search_description(webpage)
|
||||
thumbnail = files.get('still') or self._og_search_thumbnail(webpage)
|
||||
thumbnails = []
|
||||
stills = item.get('stills') or {}
|
||||
for t in ('thumb', 'still'):
|
||||
for s in ('', '-medium', '-large'):
|
||||
still_id = t + s
|
||||
still_url = stills.get(still_id)
|
||||
if not still_url:
|
||||
continue
|
||||
thumbnails.append({
|
||||
'id': still_id,
|
||||
'url': still_url,
|
||||
})
|
||||
|
||||
stats = item.get('stats') or {}
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'thumbnail': thumbnail,
|
||||
'formats': formats
|
||||
'description': item.get('description'),
|
||||
'thumbnails': thumbnails,
|
||||
'formats': formats,
|
||||
'duration': int_or_none(media.get('duration')),
|
||||
'like_count': int_or_none(stats.get('kudos_total')),
|
||||
'view_count': int_or_none(stats.get('views_total')),
|
||||
}
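The rewritten Dumpert extractor builds its formats from the API's `variants` list and ranks them with `qualities`. The following is a minimal sketch of that loop outside youtube-dl. The local `qualities` is a stand-in written on the assumption that it mirrors the `..utils` helper of the same name (position in the list, `-1` for unknown ids); `build_formats` and the sample URLs in the usage are hypothetical.

```python
def qualities(quality_ids):
    """Return a function ranking an id by its position in quality_ids.

    Stand-in for the utils helper imported in the diff (assumed behaviour).
    """
    def q(qid):
        try:
            return quality_ids.index(qid)
        except ValueError:
            return -1  # unknown ids rank below every known one
    return q


def build_formats(variants):
    """Turn API 'variants' entries into format dicts, skipping empty URIs."""
    quality = qualities(['flv', 'mobile', 'tablet', '720p'])
    formats = []
    for variant in variants:
        uri = variant.get('uri')
        if not uri:
            continue
        version = variant.get('version')
        formats.append({
            'url': uri,
            'format_id': version,
            'quality': quality(version),
        })
    return formats


sample = [
    {'uri': 'http://example.com/a.mp4', 'version': '720p'},
    {'version': 'mobile'},  # no uri: skipped
    {'uri': 'http://example.com/b.mp4', 'version': 'mystery'},
]
print(build_formats(sample))
```

Entries without a `uri` are dropped rather than raising, which matches the defensive style of the new code in the hunk.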
|
||||
|
@ -18,7 +18,6 @@ from .acast import (
|
||||
ACastIE,
|
||||
ACastChannelIE,
|
||||
)
|
||||
from .addanime import AddAnimeIE
|
||||
from .adn import ADNIE
|
||||
from .adobeconnect import AdobeConnectIE
|
||||
from .adobetv import (
|
||||
@ -80,7 +79,6 @@ from .awaan import (
|
||||
)
|
||||
from .azmedien import AZMedienIE
|
||||
from .baidu import BaiduVideoIE
|
||||
from .bambuser import BambuserIE, BambuserChannelIE
|
||||
from .bandcamp import BandcampIE, BandcampAlbumIE, BandcampWeeklyIE
|
||||
from .bbc import (
|
||||
BBCCoUkIE,
|
||||
@ -104,6 +102,8 @@ from .bild import BildIE
|
||||
from .bilibili import (
|
||||
BiliBiliIE,
|
||||
BiliBiliBangumiIE,
|
||||
BilibiliAudioIE,
|
||||
BilibiliAudioAlbumIE,
|
||||
)
|
||||
from .biobiochiletv import BioBioChileTVIE
|
||||
from .bitchute import (
|
||||
@ -222,13 +222,13 @@ from .comedycentral import (
|
||||
ComedyCentralTVIE,
|
||||
ToshIE,
|
||||
)
|
||||
from .comcarcoff import ComCarCoffIE
|
||||
from .commonmistakes import CommonMistakesIE, UnicodeBOMIE
|
||||
from .commonprotocols import (
|
||||
MmsIE,
|
||||
RtmpIE,
|
||||
)
|
||||
from .condenast import CondeNastIE
|
||||
from .contv import CONtvIE
|
||||
from .corus import CorusIE
|
||||
from .cracked import CrackedIE
|
||||
from .crackle import CrackleIE
|
||||
@ -252,10 +252,6 @@ from .dailymotion import (
|
||||
DailymotionPlaylistIE,
|
||||
DailymotionUserIE,
|
||||
)
|
||||
from .daisuki import (
|
||||
DaisukiMottoIE,
|
||||
DaisukiMottoPlaylistIE,
|
||||
)
|
||||
from .daum import (
|
||||
DaumIE,
|
||||
DaumClipIE,
|
||||
@ -274,10 +270,7 @@ from .douyutv import (
|
||||
DouyuShowIE,
|
||||
DouyuTVIE,
|
||||
)
|
||||
from .dplay import (
|
||||
DPlayIE,
|
||||
DPlayItIE,
|
||||
)
|
||||
from .dplay import DPlayIE
|
||||
from .dreisat import DreiSatIE
|
||||
from .drbonanza import DRBonanzaIE
|
||||
from .drtuber import DrTuberIE
|
||||
@ -356,7 +349,6 @@ from .firsttv import FirstTVIE
|
||||
from .fivemin import FiveMinIE
|
||||
from .fivetv import FiveTVIE
|
||||
from .flickr import FlickrIE
|
||||
from .flipagram import FlipagramIE
|
||||
from .folketinget import FolketingetIE
|
||||
from .footyroom import FootyRoomIE
|
||||
from .formula1 import Formula1IE
|
||||
@ -367,7 +359,10 @@ from .fourtube import (
|
||||
FuxIE,
|
||||
)
|
||||
from .fox import FOXIE
|
||||
from .fox9 import FOX9IE
|
||||
from .fox9 import (
|
||||
FOX9IE,
|
||||
FOX9NewsIE,
|
||||
)
|
||||
from .foxgay import FoxgayIE
|
||||
from .foxnews import (
|
||||
FoxNewsIE,
|
||||
@ -400,10 +395,6 @@ from .fusion import FusionIE
|
||||
from .fxnetworks import FXNetworksIE
|
||||
from .gaia import GaiaIE
|
||||
from .gameinformer import GameInformerIE
|
||||
from .gameone import (
|
||||
GameOneIE,
|
||||
GameOnePlaylistIE,
|
||||
)
|
||||
from .gamespot import GameSpotIE
|
||||
from .gamestar import GameStarIE
|
||||
from .gaskrank import GaskrankIE
|
||||
@ -419,7 +410,6 @@ from .globo import (
|
||||
GloboArticleIE,
|
||||
)
|
||||
from .go import GoIE
|
||||
from .go90 import Go90IE
|
||||
from .godtube import GodTubeIE
|
||||
from .golem import GolemIE
|
||||
from .googledrive import GoogleDriveIE
|
||||
@ -428,7 +418,6 @@ from .googlesearch import GoogleSearchIE
|
||||
from .goshgay import GoshgayIE
|
||||
from .gputechconf import GPUTechConfIE
|
||||
from .groupon import GrouponIE
|
||||
from .hark import HarkIE
|
||||
from .hbo import HBOIE
|
||||
from .hearthisat import HearThisAtIE
|
||||
from .heise import HeiseIE
|
||||
@ -460,7 +449,6 @@ from .hungama import (
|
||||
HungamaSongIE,
|
||||
)
|
||||
from .hypem import HypemIE
|
||||
from .iconosquare import IconosquareIE
|
||||
from .ign import (
|
||||
IGNIE,
|
||||
OneUPIE,
|
||||
@ -519,8 +507,8 @@ from .keezmovies import KeezMoviesIE
|
||||
from .ketnet import KetnetIE
|
||||
from .khanacademy import KhanAcademyIE
|
||||
from .kickstarter import KickStarterIE
|
||||
from .kinja import KinjaEmbedIE
|
||||
from .kinopoisk import KinoPoiskIE
|
||||
from .keek import KeekIE
|
||||
from .konserthusetplay import KonserthusetPlayIE
|
||||
from .kontrtube import KontrTubeIE
|
||||
from .krasview import KrasViewIE
|
||||
@ -546,7 +534,6 @@ from .lcp import (
|
||||
LcpPlayIE,
|
||||
LcpIE,
|
||||
)
|
||||
from .learnr import LearnrIE
|
||||
from .lecture2go import Lecture2GoIE
|
||||
from .lecturio import (
|
||||
LecturioIE,
|
||||
@ -598,13 +585,11 @@ from .lynda import (
|
||||
LyndaCourseIE
|
||||
)
|
||||
from .m6 import M6IE
|
||||
from .macgamestore import MacGameStoreIE
|
||||
from .mailru import (
|
||||
MailRuIE,
|
||||
MailRuMusicIE,
|
||||
MailRuMusicSearchIE,
|
||||
)
|
||||
from .makertv import MakerTVIE
|
||||
from .makotv import MakoTVIE
|
||||
from .malltv import MallTVIE
|
||||
from .mangomolo import (
|
||||
@ -639,17 +624,15 @@ from .microsoftvirtualacademy import (
|
||||
MicrosoftVirtualAcademyIE,
|
||||
MicrosoftVirtualAcademyCourseIE,
|
||||
)
|
||||
from .minhateca import MinhatecaIE
|
||||
from .ministrygrid import MinistryGridIE
|
||||
from .minoto import MinotoIE
|
||||
from .miomio import MioMioIE
|
||||
from .mit import TechTVMITIE, MITIE, OCWMITIE
|
||||
from .mit import TechTVMITIE, OCWMITIE
|
||||
from .mitele import MiTeleIE
|
||||
from .mixcloud import (
|
||||
MixcloudIE,
|
||||
MixcloudUserIE,
|
||||
MixcloudPlaylistIE,
|
||||
MixcloudStreamIE,
|
||||
)
|
||||
from .mlb import MLBIE
|
||||
from .mnet import MnetIE
|
||||
@ -671,7 +654,7 @@ from .mtv import (
|
||||
MTVVideoIE,
|
||||
MTVServicesEmbeddedIE,
|
||||
MTVDEIE,
|
||||
MTV81IE,
|
||||
MTVJapanIE,
|
||||
)
|
||||
from .muenchentv import MuenchenTVIE
|
||||
from .musicplayon import MusicPlayOnIE
|
||||
@ -892,7 +875,6 @@ from .puhutv import (
|
||||
PuhuTVSerieIE,
|
||||
)
|
||||
from .presstv import PressTVIE
|
||||
from .promptfile import PromptFileIE
|
||||
from .prosiebensat1 import ProSiebenSat1IE
|
||||
from .puls4 import Puls4IE
|
||||
from .pyvideo import PyvideoIE
|
||||
@ -944,10 +926,6 @@ from .rentv import (
|
||||
from .restudy import RestudyIE
|
||||
from .reuters import ReutersIE
|
||||
from .reverbnation import ReverbNationIE
|
||||
from .revision3 import (
|
||||
Revision3EmbedIE,
|
||||
Revision3IE,
|
||||
)
|
||||
from .rice import RICEIE
|
||||
from .rmcdecouverte import RMCDecouverteIE
|
||||
from .ro220 import Ro220IE
|
||||
@ -992,10 +970,13 @@ from .sbs import SBSIE
|
||||
from .screencast import ScreencastIE
|
||||
from .screencastomatic import ScreencastOMaticIE
|
||||
from .scrippsnetworks import ScrippsNetworksWatchIE
|
||||
from .scte import (
|
||||
SCTEIE,
|
||||
SCTECourseIE,
|
||||
)
|
||||
from .seeker import SeekerIE
|
||||
from .senateisvp import SenateISVPIE
|
||||
from .sendtonews import SendtoNewsIE
|
||||
from .servingsys import ServingSysIE
|
||||
from .servus import ServusIE
|
||||
from .sevenplus import SevenPlusIE
|
||||
from .sexu import SexuIE
|
||||
@ -1036,6 +1017,7 @@ from .snotr import SnotrIE
|
||||
from .sohu import SohuIE
|
||||
from .sonyliv import SonyLIVIE
|
||||
from .soundcloud import (
|
||||
SoundcloudEmbedIE,
|
||||
SoundcloudIE,
|
||||
SoundcloudSetIE,
|
||||
SoundcloudUserIE,
|
||||
@ -1128,12 +1110,14 @@ from .telegraaf import TelegraafIE
|
||||
from .telemb import TeleMBIE
|
||||
from .telequebec import (
|
||||
TeleQuebecIE,
|
||||
TeleQuebecSquatIE,
|
||||
TeleQuebecEmissionIE,
|
||||
TeleQuebecLiveIE,
|
||||
)
|
||||
from .teletask import TeleTaskIE
|
||||
from .telewebion import TelewebionIE
|
||||
from .tennistv import TennisTVIE
|
||||
from .tenplay import TenPlayIE
|
||||
from .testurl import TestURLIE
|
||||
from .tf1 import TF1IE
|
||||
from .tfo import TFOIE
|
||||
@ -1186,11 +1170,11 @@ from .tunein import (
|
||||
)
|
||||
from .tunepk import TunePkIE
|
||||
from .turbo import TurboIE
|
||||
from .tutv import TutvIE
|
||||
from .tv2 import (
|
||||
TV2IE,
|
||||
TV2ArticleIE,
|
||||
)
|
||||
from .tv2dk import TV2DKIE
|
||||
from .tv2hu import TV2HuIE
|
||||
from .tv4 import TV4IE
|
||||
from .tv5mondeplus import TV5MondePlusIE
|
||||
@ -1247,6 +1231,7 @@ from .twitter import (
|
||||
TwitterCardIE,
|
||||
TwitterIE,
|
||||
TwitterAmplifyIE,
|
||||
TwitterBroadcastIE,
|
||||
)
|
||||
from .udemy import (
|
||||
UdemyIE,
|
||||
@ -1281,7 +1266,6 @@ from .varzesh3 import Varzesh3IE
|
||||
from .vbox7 import Vbox7IE
|
||||
from .veehd import VeeHDIE
|
||||
from .veoh import VeohIE
|
||||
from .vessel import VesselIE
|
||||
from .vesti import VestiIE
|
||||
from .vevo import (
|
||||
VevoIE,
|
||||
@ -1323,7 +1307,6 @@ from .viewlift import (
|
||||
ViewLiftIE,
|
||||
ViewLiftEmbedIE,
|
||||
)
|
||||
from .viewster import ViewsterIE
|
||||
from .viidea import ViideaIE
|
||||
from .vimeo import (
|
||||
VimeoIE,
|
||||
@ -1412,7 +1395,6 @@ from .weibo import (
|
||||
WeiboMobileIE
|
||||
)
|
||||
from .weiqitv import WeiqiTVIE
|
||||
from .wimp import WimpIE
|
||||
from .wistia import WistiaIE
|
||||
from .worldstarhiphop import WorldStarHipHopIE
|
||||
from .wsj import (
|
||||
|
@ -334,7 +334,7 @@ class FacebookIE(InfoExtractor):
|
||||
if not video_data:
|
||||
server_js_data = self._parse_json(
|
||||
self._search_regex(
|
||||
r'bigPipe\.onPageletArrive\(({.+?})\)\s*;\s*}\s*\)\s*,\s*["\']onPageletArrive\s+(?:stream_pagelet|pagelet_group_mall|permalink_video_pagelet)',
|
||||
r'bigPipe\.onPageletArrive\(({.+?})\)\s*;\s*}\s*\)\s*,\s*["\']onPageletArrive\s+(?:pagelet_group_mall|permalink_video_pagelet|hyperfeed_story_id_\d+)',
|
||||
webpage, 'js data', default='{}'),
|
||||
video_id, transform_source=js_to_json, fatal=False)
|
||||
video_data = extract_from_jsmods_instances(server_js_data)
|
||||
@ -379,6 +379,7 @@ class FacebookIE(InfoExtractor):
|
||||
if not video_data:
|
||||
raise ExtractorError('Cannot parse data')
|
||||
|
||||
subtitles = {}
|
||||
formats = []
|
||||
for f in video_data:
|
||||
format_id = f['stream_type']
|
||||
@ -402,9 +403,17 @@ class FacebookIE(InfoExtractor):
|
||||
if dash_manifest:
|
||||
formats.extend(self._parse_mpd_formats(
|
||||
compat_etree_fromstring(compat_urllib_parse_unquote_plus(dash_manifest))))
|
||||
subtitles_src = f[0].get('subtitles_src')
|
||||
if subtitles_src:
|
||||
subtitles.setdefault('en', []).append({'url': subtitles_src})
|
||||
if not formats:
|
||||
raise ExtractorError('Cannot find video formats')
|
||||
|
||||
# Downloads with browser's User-Agent are rate limited. Working around
|
||||
# with non-browser User-Agent.
|
||||
for f in formats:
|
||||
f.setdefault('http_headers', {})['User-Agent'] = 'facebookexternalhit/1.1'
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
video_title = self._html_search_regex(
|
||||
@ -442,6 +451,7 @@ class FacebookIE(InfoExtractor):
|
||||
'timestamp': timestamp,
|
||||
'thumbnail': thumbnail,
|
||||
'view_count': view_count,
|
||||
'subtitles': subtitles,
|
||||
}
|
||||
|
||||
return webpage, info_dict
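The User-Agent workaround added in the Facebook hunk relies on `dict.setdefault` to create `http_headers` only when a format lacks it, so headers that are already present are kept. A small sketch of that idiom follows; `apply_user_agent` and the sample format dicts are hypothetical, and only the header value comes from the diff.

```python
def apply_user_agent(formats, user_agent='facebookexternalhit/1.1'):
    """Set a User-Agent on every format, preserving other existing headers."""
    for f in formats:
        # setdefault returns the existing header dict, or inserts a new one.
        f.setdefault('http_headers', {})['User-Agent'] = user_agent
    return formats


formats = apply_user_agent([
    {'url': 'http://example.com/sd.mp4'},
    {'url': 'http://example.com/hd.mp4', 'http_headers': {'Referer': 'http://example.com/'}},
])
print(formats)
```

A plain assignment of a fresh `http_headers` dict would discard the `Referer` in the second entry, which is the case the `setdefault` form avoids.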
|
||||
|
@ -1,115 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
float_or_none,
|
||||
try_get,
|
||||
unified_timestamp,
|
||||
)
|
||||
|
||||
|
||||
class FlipagramIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?flipagram\.com/f/(?P<id>[^/?#&]+)'
|
||||
_TEST = {
|
||||
'url': 'https://flipagram.com/f/nyvTSJMKId',
|
||||
'md5': '888dcf08b7ea671381f00fab74692755',
|
||||
'info_dict': {
|
||||
'id': 'nyvTSJMKId',
|
||||
'ext': 'mp4',
|
||||
'title': 'Flipagram by sjuria101 featuring Midnight Memories by One Direction',
|
||||
'description': 'md5:d55e32edc55261cae96a41fa85ff630e',
|
||||
'duration': 35.571,
|
||||
'timestamp': 1461244995,
|
||||
'upload_date': '20160421',
|
||||
'uploader': 'kitty juria',
|
||||
'uploader_id': 'sjuria101',
|
||||
'creator': 'kitty juria',
|
||||
'view_count': int,
|
||||
'like_count': int,
|
||||
'repost_count': int,
|
||||
'comment_count': int,
|
||||
'comments': list,
|
||||
'formats': 'mincount:2',
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
video_data = self._parse_json(
|
||||
self._search_regex(
|
||||
r'window\.reactH2O\s*=\s*({.+});', webpage, 'video data'),
|
||||
video_id)
|
||||
|
||||
flipagram = video_data['flipagram']
|
||||
video = flipagram['video']
|
||||
|
||||
json_ld = self._search_json_ld(webpage, video_id, default={})
|
||||
title = json_ld.get('title') or flipagram['captionText']
|
||||
description = json_ld.get('description') or flipagram.get('captionText')
|
||||
|
||||
formats = [{
|
||||
'url': video['url'],
|
||||
'width': int_or_none(video.get('width')),
|
||||
'height': int_or_none(video.get('height')),
|
||||
'filesize': int_or_none(video_data.get('size')),
|
||||
}]
|
||||
|
||||
preview_url = try_get(
|
||||
flipagram, lambda x: x['music']['track']['previewUrl'], compat_str)
|
||||
if preview_url:
|
||||
formats.append({
|
||||
'url': preview_url,
|
||||
'ext': 'm4a',
|
||||
'vcodec': 'none',
|
||||
})
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
counts = flipagram.get('counts', {})
|
||||
user = flipagram.get('user', {})
|
||||
video_data = flipagram.get('video', {})
|
||||
|
||||
thumbnails = [{
|
||||
'url': self._proto_relative_url(cover['url']),
|
||||
'width': int_or_none(cover.get('width')),
|
||||
'height': int_or_none(cover.get('height')),
|
||||
'filesize': int_or_none(cover.get('size')),
|
||||
} for cover in flipagram.get('covers', []) if cover.get('url')]
|
||||
|
||||
# Note that this only retrieves comments that are initially loaded.
|
||||
# For videos with large amounts of comments, most won't be retrieved.
|
||||
comments = []
|
||||
for comment in video_data.get('comments', {}).get(video_id, {}).get('items', []):
|
||||
text = comment.get('comment')
|
||||
if not text or not isinstance(text, list):
|
||||
continue
|
||||
comments.append({
|
||||
'author': comment.get('user', {}).get('name'),
|
||||
'author_id': comment.get('user', {}).get('username'),
|
||||
'id': comment.get('id'),
|
||||
'text': text[0],
|
||||
'timestamp': unified_timestamp(comment.get('created')),
|
||||
})
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'duration': float_or_none(flipagram.get('duration'), 1000),
|
||||
'thumbnails': thumbnails,
|
||||
'timestamp': unified_timestamp(flipagram.get('iso8601Created')),
|
||||
'uploader': user.get('name'),
|
||||
'uploader_id': user.get('username'),
|
||||
'creator': user.get('name'),
|
||||
'view_count': int_or_none(counts.get('plays')),
|
||||
'like_count': int_or_none(counts.get('likes')),
|
||||
'repost_count': int_or_none(counts.get('reflips')),
|
||||
'comment_count': int_or_none(counts.get('comments')),
|
||||
'comments': comments,
|
||||
'formats': formats,
|
||||
}
|
@ -1,13 +1,23 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .anvato import AnvatoIE
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class FOX9IE(AnvatoIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?fox9\.com/(?:[^/]+/)+(?P<id>\d+)-story'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.fox9.com/news/215123287-story',
|
||||
class FOX9IE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?fox9\.com/video/(?P<id>\d+)'
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
return self.url_result(
|
||||
'anvato:anvato_epfox_app_web_prod_b3373168e12f423f41504f207000188daf88251b:' + video_id,
|
||||
'Anvato', video_id)
|
||||
|
||||
|
||||
class FOX9NewsIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?fox9\.com/news/(?P<id>[^/?&#]+)'
|
||||
_TEST = {
|
||||
'url': 'https://www.fox9.com/news/black-bear-in-tree-draws-crowd-in-downtown-duluth-minnesota',
|
||||
'md5': 'd6e1b2572c3bab8a849c9103615dd243',
|
||||
'info_dict': {
|
||||
'id': '314473',
|
||||
@ -21,22 +31,11 @@ class FOX9IE(AnvatoIE):
|
||||
'categories': ['News', 'Sports'],
|
||||
'tags': ['news', 'video'],
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.fox9.com/news/investigators/214070684-story',
|
||||
'only_matching': True,
|
||||
}]
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
video_id = self._parse_json(
|
||||
self._search_regex(
|
||||
r"this\.videosJson\s*=\s*'(\[.+?\])';",
|
||||
webpage, 'anvato playlist'),
|
||||
video_id)[0]['video']
|
||||
|
||||
return self._get_anvato_videos(
|
||||
'anvato_epfox_app_web_prod_b3373168e12f423f41504f207000188daf88251b',
|
||||
video_id)
|
||||
display_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
anvato_id = self._search_regex(
|
||||
r'anvatoId\s*:\s*[\'"](\d+)', webpage, 'anvato id')
|
||||
return self.url_result('https://www.fox9.com/video/' + anvato_id, 'FOX9')
|
||||
|
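Note on the FOX9 change above: the news extractor now only finds the Anvato ID in the article page and hands off to the `/video/<id>` URL, which the video extractor turns into an `anvato:` reference. The following is a minimal standalone sketch of that two-step mapping using plain `re`. The helper name `news_page_to_video_url` and the sample HTML fragment are assumptions for illustration and are not part of the commit.

```python
import re

# Patterns copied from the FOX9IE / FOX9NewsIE hunks above.
FOX9_VIDEO_RE = r'https?://(?:www\.)?fox9\.com/video/(?P<id>\d+)'
FOX9_NEWS_RE = r'https?://(?:www\.)?fox9\.com/news/(?P<id>[^/?&#]+)'
ANVATO_ID_RE = r'anvatoId\s*:\s*[\'"](\d+)'


def news_page_to_video_url(news_url, webpage):
    """Map a news article URL plus its HTML to the /video/<id> URL.

    Hypothetical helper: returns None when the URL is not a news URL
    or when no Anvato ID is present in the page.
    """
    if not re.match(FOX9_NEWS_RE, news_url):
        return None
    mobj = re.search(ANVATO_ID_RE, webpage)
    if mobj is None:
        return None
    return 'https://www.fox9.com/video/' + mobj.group(1)


# Hypothetical page fragment; real pages may embed the ID differently.
sample_html = '<script>player.load({anvatoId: "314473"});</script>'
video_url = news_page_to_video_url(
    'https://www.fox9.com/news/black-bear-in-tree-draws-crowd-in-downtown-duluth-minnesota',
    sample_html)
```

The resulting URL is matched by the video pattern, so the second extractor can pick it up without downloading the article again.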
@ -1,134 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
xpath_with_ns,
|
||||
parse_iso8601,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
)
|
||||
|
||||
NAMESPACE_MAP = {
|
||||
'media': 'http://search.yahoo.com/mrss/',
|
||||
}
|
||||
|
||||
# URL prefix to download the mp4 files directly instead of streaming via rtmp
|
||||
# Credits go to XBox-Maniac
|
||||
# http://board.jdownloader.org/showpost.php?p=185835&postcount=31
|
||||
RAW_MP4_URL = 'http://cdn.riptide-mtvn.com/'
|
||||
|
||||
|
||||
class GameOneIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?gameone\.de/tv/(?P<id>\d+)'
|
||||
_TESTS = [
|
||||
{
|
||||
'url': 'http://www.gameone.de/tv/288',
|
||||
'md5': '136656b7fb4c9cb4a8e2d500651c499b',
|
||||
'info_dict': {
|
||||
'id': '288',
|
||||
'ext': 'mp4',
|
||||
'title': 'Game One - Folge 288',
|
||||
'duration': 1238,
|
||||
'thumbnail': 'http://s3.gameone.de/gameone/assets/video_metas/teaser_images/000/643/636/big/640x360.jpg',
|
||||
'description': 'FIFA-Pressepokal 2014, Star Citizen, Kingdom Come: Deliverance, Project Cars, Schöner Trants Nerdquiz Folge 2 Runde 1',
|
||||
'age_limit': 16,
|
||||
'upload_date': '20140513',
|
||||
'timestamp': 1399980122,
|
||||
}
|
||||
},
|
||||
{
|
||||
'url': 'http://gameone.de/tv/220',
|
||||
'md5': '5227ca74c4ae6b5f74c0510a7c48839e',
|
||||
'info_dict': {
|
||||
'id': '220',
|
||||
'ext': 'mp4',
|
||||
'upload_date': '20120918',
|
||||
'description': 'Jet Set Radio HD, Tekken Tag Tournament 2, Source Filmmaker',
|
||||
'timestamp': 1347971451,
|
||||
'title': 'Game One - Folge 220',
|
||||
'duration': 896.62,
|
||||
'age_limit': 16,
|
||||
}
|
||||
}
|
||||
|
||||
]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
og_video = self._og_search_video_url(webpage, secure=False)
|
||||
description = self._html_search_meta('description', webpage)
|
||||
age_limit = int(
|
||||
self._search_regex(
|
||||
r'age=(\d+)',
|
||||
self._html_search_meta(
|
||||
'age-de-meta-label',
|
||||
webpage),
|
||||
'age_limit',
|
||||
'0'))
|
||||
mrss_url = self._search_regex(r'mrss=([^&]+)', og_video, 'mrss')
|
||||
|
||||
mrss = self._download_xml(mrss_url, video_id, 'Downloading mrss')
|
||||
title = mrss.find('.//item/title').text
|
||||
thumbnail = mrss.find('.//item/image').get('url')
|
||||
timestamp = parse_iso8601(mrss.find('.//pubDate').text, delimiter=' ')
|
||||
content = mrss.find(xpath_with_ns('.//media:content', NAMESPACE_MAP))
|
||||
content_url = content.get('url')
|
||||
|
||||
content = self._download_xml(
|
||||
content_url,
|
||||
video_id,
|
||||
'Downloading media:content')
|
||||
rendition_items = content.findall('.//rendition')
|
||||
duration = float_or_none(rendition_items[0].get('duration'))
|
||||
formats = [
|
||||
{
|
||||
'url': re.sub(r'.*/(r2)', RAW_MP4_URL + r'\1', r.find('./src').text),
|
||||
'width': int_or_none(r.get('width')),
|
||||
'height': int_or_none(r.get('height')),
|
||||
'tbr': int_or_none(r.get('bitrate')),
|
||||
}
|
||||
for r in rendition_items
|
||||
]
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'thumbnail': thumbnail,
|
||||
'duration': duration,
|
||||
'formats': formats,
|
||||
'description': description,
|
||||
'age_limit': age_limit,
|
||||
'timestamp': timestamp,
|
||||
}
|
||||
|
||||
|
||||
class GameOnePlaylistIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?gameone\.de(?:/tv)?/?$'
|
||||
IE_NAME = 'gameone:playlist'
|
||||
_TEST = {
|
||||
'url': 'http://www.gameone.de/tv',
|
||||
'info_dict': {
|
||||
'title': 'GameOne',
|
||||
},
|
||||
'playlist_mincount': 294,
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
webpage = self._download_webpage('http://www.gameone.de/tv', 'TV')
|
||||
max_id = max(map(int, re.findall(r'<a href="/tv/(\d+)"', webpage)))
|
||||
entries = [
|
||||
self.url_result('http://www.gameone.de/tv/%d' %
|
||||
video_id, 'GameOne')
|
||||
for video_id in range(max_id, 0, -1)]
|
||||
|
||||
return {
|
||||
'_type': 'playlist',
|
||||
'title': 'GameOne',
|
||||
'entries': entries,
|
||||
}
|
@ -77,11 +77,10 @@ from .instagram import InstagramIE
|
||||
from .liveleak import LiveLeakIE
|
||||
from .threeqsdn import ThreeQSDNIE
|
||||
from .theplatform import ThePlatformIE
|
||||
from .vessel import VesselIE
|
||||
from .kaltura import KalturaIE
|
||||
from .eagleplatform import EaglePlatformIE
|
||||
from .facebook import FacebookIE
|
||||
from .soundcloud import SoundcloudIE
|
||||
from .soundcloud import SoundcloudEmbedIE
|
||||
from .tunein import TuneInBaseIE
|
||||
from .vbox7 import Vbox7IE
|
||||
from .dbtv import DBTVIE
|
||||
@ -119,6 +118,8 @@ from .foxnews import FoxNewsIE
|
||||
from .viqeo import ViqeoIE
|
||||
from .expressen import ExpressenIE
|
||||
from .zype import ZypeIE
|
||||
from .odnoklassniki import OdnoklassnikiIE
|
||||
from .kinja import KinjaEmbedIE
|
||||
|
||||
|
||||
class GenericIE(InfoExtractor):
|
||||
@ -1487,16 +1488,18 @@ class GenericIE(InfoExtractor):
|
||||
'timestamp': 1432570283,
|
||||
},
|
||||
},
|
||||
# OnionStudios embed
|
||||
# Kinja embed
|
||||
{
|
||||
'url': 'http://www.clickhole.com/video/dont-understand-bitcoin-man-will-mumble-explanatio-2537',
|
||||
'info_dict': {
|
||||
'id': '2855',
|
||||
'id': '106351',
|
||||
'ext': 'mp4',
|
||||
'title': 'Don’t Understand Bitcoin? This Man Will Mumble An Explanation At You',
|
||||
'description': 'Migrated from OnionStudios',
|
||||
'thumbnail': r're:^https?://.*\.jpe?g$',
|
||||
'uploader': 'ClickHole',
|
||||
'uploader_id': 'clickhole',
|
||||
'uploader': 'clickhole',
|
||||
'upload_date': '20150527',
|
||||
'timestamp': 1432744860,
|
||||
}
|
||||
},
|
||||
# SnagFilms embed
|
||||
@ -2491,11 +2494,6 @@ class GenericIE(InfoExtractor):
|
||||
if tp_urls:
|
||||
return self.playlist_from_matches(tp_urls, video_id, video_title, ie='ThePlatform')
|
||||
|
||||
# Look for Vessel embeds
|
||||
vessel_urls = VesselIE._extract_urls(webpage)
|
||||
if vessel_urls:
|
||||
return self.playlist_from_matches(vessel_urls, video_id, video_title, ie=VesselIE.ie_key())
|
||||
|
||||
# Look for embedded rtl.nl player
|
||||
matches = re.findall(
|
||||
r'<iframe[^>]+?src="((?:https?:)?//(?:(?:www|static)\.)?rtl\.nl/(?:system/videoplayer/[^"]+(?:video_)?)?embed[^"]+)"',
|
||||
@ -2633,9 +2631,9 @@ class GenericIE(InfoExtractor):
|
||||
return self.url_result(mobj.group('url'), 'VK')
|
||||
|
||||
# Look for embedded Odnoklassniki player
|
||||
mobj = re.search(r'<iframe[^>]+?src=(["\'])(?P<url>https?://(?:odnoklassniki|ok)\.ru/videoembed/.+?)\1', webpage)
|
||||
if mobj is not None:
|
||||
return self.url_result(mobj.group('url'), 'Odnoklassniki')
|
||||
odnoklassniki_url = OdnoklassnikiIE._extract_url(webpage)
|
||||
if odnoklassniki_url:
|
||||
return self.url_result(odnoklassniki_url, OdnoklassnikiIE.ie_key())
|
||||
|
||||
# Look for embedded ivi player
|
||||
mobj = re.search(r'<embed[^>]+?src=(["\'])(?P<url>https?://(?:www\.)?ivi\.ru/video/player.+?)\1', webpage)
|
||||
@ -2754,9 +2752,9 @@ class GenericIE(InfoExtractor):
|
||||
return self.url_result(myvi_url)
|
||||
|
||||
# Look for embedded soundcloud player
|
||||
soundcloud_urls = SoundcloudIE._extract_urls(webpage)
|
||||
soundcloud_urls = SoundcloudEmbedIE._extract_urls(webpage)
|
||||
if soundcloud_urls:
|
||||
return self.playlist_from_matches(soundcloud_urls, video_id, video_title, getter=unescapeHTML, ie=SoundcloudIE.ie_key())
|
||||
return self.playlist_from_matches(soundcloud_urls, video_id, video_title, getter=unescapeHTML)
|
||||
|
||||
# Look for tunein player
|
||||
tunein_urls = TuneInBaseIE._extract_urls(webpage)
|
||||
@ -2899,6 +2897,12 @@ class GenericIE(InfoExtractor):
|
||||
if senate_isvp_url:
|
||||
return self.url_result(senate_isvp_url, 'SenateISVP')
|
||||
|
||||
# Look for Kinja embeds
|
||||
kinja_embed_urls = KinjaEmbedIE._extract_urls(webpage, url)
|
||||
if kinja_embed_urls:
|
||||
return self.playlist_from_matches(
|
||||
kinja_embed_urls, video_id, video_title)
|
||||
|
||||
# Look for OnionStudios embeds
|
||||
onionstudios_url = OnionStudiosIE._extract_url(webpage)
|
||||
if onionstudios_url:
|
||||
@ -2968,10 +2972,14 @@ class GenericIE(InfoExtractor):
|
||||
|
||||
# Look for Mangomolo embeds
|
||||
mobj = re.search(
|
||||
r'''(?x)<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//(?:www\.)?admin\.mangomolo\.com/analytics/index\.php/customers/embed/
|
||||
r'''(?x)<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//
|
||||
(?:
|
||||
admin\.mangomolo\.com/analytics/index\.php/customers/embed|
|
||||
player\.mangomolo\.com/v1
|
||||
)/
|
||||
(?:
|
||||
video\?.*?\bid=(?P<video_id>\d+)|
|
||||
index\?.*?\bchannelid=(?P<channel_id>(?:[A-Za-z0-9+/=]|%2B|%2F|%3D)+)
|
||||
(?:index|live)\?.*?\bchannelid=(?P<channel_id>(?:[A-Za-z0-9+/=]|%2B|%2F|%3D)+)
|
||||
).+?)\1''', webpage)
|
||||
if mobj is not None:
|
||||
info = {
|
||||
|
@ -11,7 +11,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class GfycatIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?gfycat\.com/(?:ru/|ifr/|gifs/detail/)?(?P<id>[^-/?#]+)'
|
||||
_VALID_URL = r'https?://(?:(?:www|giant|thumbs)\.)?gfycat\.com/(?:ru/|ifr/|gifs/detail/)?(?P<id>[^-/?#\.]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://gfycat.com/DeadlyDecisiveGermanpinscher',
|
||||
'info_dict': {
|
||||
@ -53,6 +53,12 @@ class GfycatIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'https://gfycat.com/acceptablehappygoluckyharborporpoise-baseball',
|
||||
'only_matching': True
|
||||
}, {
|
||||
'url': 'https://thumbs.gfycat.com/acceptablehappygoluckyharborporpoise-size_restricted.gif',
|
||||
'only_matching': True
|
||||
}, {
|
||||
'url': 'https://giant.gfycat.com/acceptablehappygoluckyharborporpoise.mp4',
|
||||
'only_matching': True
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
|
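Note on the Gfycat change above: the new `_VALID_URL` accepts the `giant.` and `thumbs.` hosts and excludes `.` from the ID, so direct media links resolve to the same ID as the page URL. A small sketch comparing the old and new patterns on the test URLs from the hunk follows. The `gfycat_id` helper is an illustrative assumption, not the extractor's API.

```python
import re

# Old and new patterns, copied from the GfycatIE hunk above.
OLD_VALID_URL = r'https?://(?:www\.)?gfycat\.com/(?:ru/|ifr/|gifs/detail/)?(?P<id>[^-/?#]+)'
NEW_VALID_URL = r'https?://(?:(?:www|giant|thumbs)\.)?gfycat\.com/(?:ru/|ifr/|gifs/detail/)?(?P<id>[^-/?#\.]+)'


def gfycat_id(url, pattern=NEW_VALID_URL):
    """Return the captured ID, or None if the URL does not match."""
    mobj = re.match(pattern, url)
    return mobj.group('id') if mobj else None


page_id = gfycat_id('https://gfycat.com/acceptablehappygoluckyharborporpoise-baseball')
thumb_id = gfycat_id('https://thumbs.gfycat.com/acceptablehappygoluckyharborporpoise-size_restricted.gif')
giant_id = gfycat_id('https://giant.gfycat.com/acceptablehappygoluckyharborporpoise.mp4')

# The old pattern rejects the media hosts outright.
old_giant_id = gfycat_id(
    'https://giant.gfycat.com/acceptablehappygoluckyharborporpoise.mp4',
    OLD_VALID_URL)
```

All three URLs yield the same ID under the new pattern, while the old pattern returns no match for the `giant.` host.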
@ -96,21 +96,31 @@ class GloboIE(InfoExtractor):
|
||||
video = self._download_json(
|
||||
'http://api.globovideos.com/videos/%s/playlist' % video_id,
|
||||
video_id)['videos'][0]
|
||||
if video.get('encrypted') is True:
|
||||
raise ExtractorError('This video is DRM protected.', expected=True)
|
||||
|
||||
title = video['title']
|
||||
|
||||
formats = []
|
||||
subtitles = {}
|
||||
for resource in video['resources']:
|
||||
resource_id = resource.get('_id')
|
||||
resource_url = resource.get('url')
|
||||
if not resource_id or not resource_url:
|
||||
resource_type = resource.get('type')
|
||||
if not resource_url or (resource_type == 'media' and not resource_id) or resource_type not in ('subtitle', 'media'):
|
||||
continue
|
||||
|
||||
if resource_type == 'subtitle':
|
||||
subtitles.setdefault(resource.get('language') or 'por', []).append({
|
||||
'url': resource_url,
|
||||
})
|
||||
continue
|
||||
|
||||
security = self._download_json(
|
||||
'http://security.video.globo.com/videos/%s/hash' % video_id,
|
||||
video_id, 'Downloading security hash for %s' % resource_id, query={
|
||||
'player': 'flash',
|
||||
'version': '17.0.0.132',
|
||||
'player': 'desktop',
|
||||
'version': '5.19.1',
|
||||
'resource_id': resource_id,
|
||||
})
|
||||
|
||||
@ -123,18 +133,23 @@ class GloboIE(InfoExtractor):
|
||||
continue
|
||||
|
||||
hash_code = security_hash[:2]
|
||||
received_time = security_hash[2:12]
|
||||
received_random = security_hash[12:22]
|
||||
received_md5 = security_hash[22:]
|
||||
|
||||
sign_time = compat_str(int(received_time) + 86400)
|
||||
padding = '%010d' % random.randint(1, 10000000000)
|
||||
if hash_code in ('04', '14'):
|
||||
received_time = security_hash[3:13]
|
||||
received_md5 = security_hash[24:]
|
||||
hash_prefix = security_hash[:23]
|
||||
elif hash_code in ('02', '12', '03', '13'):
|
||||
received_time = security_hash[2:12]
|
||||
received_md5 = security_hash[22:]
|
||||
padding += '1'
|
||||
hash_prefix = '05' + security_hash[:22]
|
||||
|
||||
md5_data = (received_md5 + sign_time + padding + '0xFF01DD').encode()
|
||||
padded_sign_time = compat_str(int(received_time) + 86400) + padding
|
||||
md5_data = (received_md5 + padded_sign_time + '0xAC10FD').encode()
|
||||
signed_md5 = base64.urlsafe_b64encode(hashlib.md5(md5_data).digest()).decode().strip('=')
|
||||
signed_hash = hash_code + received_time + received_random + sign_time + padding + signed_md5
|
||||
signed_hash = hash_prefix + padded_sign_time + signed_md5
|
||||
signed_url = '%s?h=%s&k=html5&a=%s&u=%s' % (resource_url, signed_hash, 'F' if video.get('subscriber_only') else 'A', security.get('user') or '')
|
||||
|
||||
signed_url = '%s?h=%s&k=%s' % (resource_url, signed_hash, 'flash')
|
||||
if resource_id.endswith('m3u8') or resource_url.endswith('.m3u8'):
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
signed_url, resource_id, 'mp4', entry_protocol='m3u8_native',
|
||||
@ -164,7 +179,8 @@ class GloboIE(InfoExtractor):
|
||||
'duration': duration,
|
||||
'uploader': uploader,
|
||||
'uploader_id': uploader_id,
|
||||
'formats': formats
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
}
|
||||
|
||||
|
||||
|
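Note on the Globo change above: the new signing logic branches on the two-character hash code, slices the timestamp and MD5 from different offsets, and assembles the signed hash from a prefix, the padded sign time, and a URL-safe base64 MD5. The sketch below restates the new-side lines as a standalone function so the slicing can be checked in isolation. The function name `sign_security_hash`, the explicit `padding` argument (the commit generates it randomly), the `ValueError` for unknown codes, and the sample hashes are assumptions for illustration.

```python
import base64
import hashlib


def sign_security_hash(security_hash, padding):
    """Rebuild the signed hash following the new-side lines of the hunk.

    `padding` is passed in (the commit draws it with random.randint) so
    that the result is reproducible here.
    """
    hash_code = security_hash[:2]
    if hash_code in ('04', '14'):
        received_time = security_hash[3:13]
        received_md5 = security_hash[24:]
        hash_prefix = security_hash[:23]
    elif hash_code in ('02', '12', '03', '13'):
        received_time = security_hash[2:12]
        received_md5 = security_hash[22:]
        padding += '1'
        hash_prefix = '05' + security_hash[:22]
    else:
        # Assumption: the hunk does not show how other codes are handled.
        raise ValueError('unsupported hash code: %s' % hash_code)

    # The signature is valid for one day past the received timestamp.
    padded_sign_time = str(int(received_time) + 86400) + padding
    md5_data = (received_md5 + padded_sign_time + '0xAC10FD').encode()
    signed_md5 = base64.urlsafe_b64encode(
        hashlib.md5(md5_data).digest()).decode().strip('=')
    return hash_prefix + padded_sign_time + signed_md5


# Made-up hashes that only respect the offsets used above.
hash_04 = '04' + 'x' + '1500000000' + 'r' * 10 + 'y' + 'abcdef'
hash_02 = '02' + '1500000000' + 'r' * 10 + 'abcdef'
signed_04 = sign_security_hash(hash_04, '0000000042')
signed_02 = sign_security_hash(hash_02, '0000000042')
```

Passing the padding explicitly is only a testing convenience; it does not change how the pieces are concatenated.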
@ -40,8 +40,17 @@ class GoIE(AdobePassIE):
|
||||
'resource_id': 'Disney',
|
||||
}
|
||||
}
|
||||
_VALID_URL = r'https?://(?:(?:(?P<sub_domain>%s)\.)?go|(?P<sub_domain_2>disneynow))\.com/(?:(?:[^/]+/)*(?P<id>vdka\w+)|(?:[^/]+/)*(?P<display_id>[^/?#]+))'\
|
||||
% '|'.join(list(_SITE_INFO.keys()) + ['disneynow'])
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:
|
||||
(?:(?P<sub_domain>%s)\.)?go|
|
||||
(?P<sub_domain_2>abc|freeform|disneynow)
|
||||
)\.com/
|
||||
(?:
|
||||
(?:[^/]+/)*(?P<id>[Vv][Dd][Kk][Aa]\w+)|
|
||||
(?:[^/]+/)*(?P<display_id>[^/?\#]+)
|
||||
)
|
||||
''' % '|'.join(list(_SITE_INFO.keys()))
|
||||
_TESTS = [{
|
||||
'url': 'http://abc.go.com/shows/designated-survivor/video/most-recent/VDKA3807643',
|
||||
'info_dict': {
|
||||
@ -54,6 +63,7 @@ class GoIE(AdobePassIE):
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
'skip': 'This content is no longer available.',
|
||||
}, {
|
||||
'url': 'http://watchdisneyxd.go.com/doraemon',
|
||||
'info_dict': {
|
||||
@ -61,6 +71,34 @@ class GoIE(AdobePassIE):
|
||||
'id': 'SH55574025',
|
||||
},
|
||||
'playlist_mincount': 51,
|
||||
}, {
|
||||
'url': 'http://freeform.go.com/shows/shadowhunters/episodes/season-2/1-this-guilty-blood',
|
||||
'info_dict': {
|
||||
'id': 'VDKA3609139',
|
||||
'ext': 'mp4',
|
||||
'title': 'This Guilty Blood',
|
||||
'description': 'md5:f18e79ad1c613798d95fdabfe96cd292',
|
||||
'age_limit': 14,
|
||||
},
|
||||
'params': {
|
||||
'geo_bypass_ip_block': '3.244.239.0/24',
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'https://abc.com/shows/the-rookie/episode-guide/season-02/03-the-bet',
|
||||
'info_dict': {
|
||||
'id': 'VDKA13435179',
|
||||
'ext': 'mp4',
|
||||
'title': 'The Bet',
|
||||
'description': 'md5:c66de8ba2e92c6c5c113c3ade84ab404',
|
||||
'age_limit': 14,
|
||||
},
|
||||
'params': {
|
||||
'geo_bypass_ip_block': '3.244.239.0/24',
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://abc.go.com/shows/the-catch/episode-guide/season-01/10-the-wedding',
|
||||
'only_matching': True,
|
||||
@ -95,10 +133,13 @@ class GoIE(AdobePassIE):
|
||||
if not video_id or not site_info:
|
||||
webpage = self._download_webpage(url, display_id or video_id)
|
||||
video_id = self._search_regex(
|
||||
# There may be inner quotes, e.g. data-video-id="'VDKA3609139'"
|
||||
# from http://freeform.go.com/shows/shadowhunters/episodes/season-2/1-this-guilty-blood
|
||||
r'data-video-id=["\']*(VDKA\w+)', webpage, 'video id',
|
||||
default=video_id)
|
||||
(
|
||||
# There may be inner quotes, e.g. data-video-id="'VDKA3609139'"
|
||||
# from http://freeform.go.com/shows/shadowhunters/episodes/season-2/1-this-guilty-blood
|
||||
r'data-video-id=["\']*(VDKA\w+)',
|
||||
# https://abc.com/shows/the-rookie/episode-guide/season-02/03-the-bet
|
||||
r'\b(?:video)?id["\']\s*:\s*["\'](VDKA\w+)'
|
||||
), webpage, 'video id', default=video_id)
|
||||
if not site_info:
|
||||
brand = self._search_regex(
|
||||
(r'data-brand=\s*["\']\s*(\d+)',
|
||||
|
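Note on the Go change above: the rewritten `_VALID_URL` uses verbose mode, adds the bare `abc.com` and `freeform.com` hosts through `sub_domain_2`, and accepts the `VDKA` prefix in any letter case. The sketch below builds the same pattern with a stand-in for `_SITE_INFO`. The `SITE_KEYS` list is a hypothetical subset chosen from hosts that appear in the tests, since the full dictionary is not visible in this hunk.

```python
import re

# Hypothetical subset of _SITE_INFO keys; the real dict is not shown here.
SITE_KEYS = ['abc', 'freeform', 'watchdisneyxd']

# Same structure as the new _VALID_URL. The '#' is escaped because the
# pattern is compiled in verbose mode, where '#' can start a comment.
VALID_URL = r'''(?x)
    https?://
    (?:
        (?:(?P<sub_domain>%s)\.)?go|
        (?P<sub_domain_2>abc|freeform|disneynow)
    )\.com/
    (?:
        (?:[^/]+/)*(?P<id>[Vv][Dd][Kk][Aa]\w+)|
        (?:[^/]+/)*(?P<display_id>[^/?\#]+)
    )
''' % '|'.join(SITE_KEYS)


def parse_go_url(url):
    """Return the named groups of the match, or None if there is none."""
    mobj = re.match(VALID_URL, url)
    return mobj.groupdict() if mobj else None


with_id = parse_go_url(
    'http://abc.go.com/shows/designated-survivor/video/most-recent/VDKA3807643')
bare_host = parse_go_url(
    'https://abc.com/shows/the-rookie/episode-guide/season-02/03-the-bet')
```

For the bare host only `sub_domain_2` and `display_id` are populated, which is the case where the extractor has to download the page to look for the video ID.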
@ -1,149 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_HTTPError
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
parse_age_limit,
|
||||
parse_iso8601,
|
||||
)
|
||||
|
||||
|
||||
class Go90IE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?go90\.com/(?:videos|embed)/(?P<id>[0-9a-zA-Z]+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.go90.com/videos/84BUqjLpf9D',
|
||||
'md5': 'efa7670dbbbf21a7b07b360652b24a32',
|
||||
'info_dict': {
|
||||
'id': '84BUqjLpf9D',
|
||||
'ext': 'mp4',
|
||||
'title': 'Daily VICE - Inside The Utah Coalition Against Pornography Convention',
|
||||
'description': 'VICE\'s Karley Sciortino meets with activists who discuss the state\'s strong anti-porn stance. Then, VICE Sports explains NFL contracts.',
|
||||
'timestamp': 1491868800,
|
||||
'upload_date': '20170411',
|
||||
'age_limit': 14,
|
||||
}
|
||||
}, {
|
||||
'url': 'https://www.go90.com/embed/261MflWkD3N',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_GEO_BYPASS = False
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
try:
|
||||
headers = self.geo_verification_headers()
|
||||
headers.update({
|
||||
'Content-Type': 'application/json; charset=utf-8',
|
||||
})
|
||||
video_data = self._download_json(
|
||||
'https://www.go90.com/api/view/items/' + video_id, video_id,
|
||||
headers=headers, data=b'{"client":"web","device_type":"pc"}')
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
|
||||
message = self._parse_json(e.cause.read().decode(), None)['error']['message']
|
||||
if 'region unavailable' in message:
|
||||
self.raise_geo_restricted(countries=['US'])
|
||||
raise ExtractorError(message, expected=True)
|
||||
raise
|
||||
|
||||
if video_data.get('requires_drm'):
|
||||
raise ExtractorError('This video is DRM protected.', expected=True)
|
||||
main_video_asset = video_data['main_video_asset']
|
||||
|
||||
episode_number = int_or_none(video_data.get('episode_number'))
|
||||
series = None
|
||||
season = None
|
||||
season_id = None
|
||||
season_number = None
|
||||
for metadata in video_data.get('__children', {}).get('Item', {}).values():
|
||||
if metadata.get('type') == 'show':
|
||||
series = metadata.get('title')
|
||||
elif metadata.get('type') == 'season':
|
||||
season = metadata.get('title')
|
||||
season_id = metadata.get('id')
|
||||
season_number = int_or_none(metadata.get('season_number'))
|
||||
|
||||
title = episode = video_data.get('title') or series
|
||||
if series and series != title:
|
||||
title = '%s - %s' % (series, title)
|
||||
|
||||
thumbnails = []
|
||||
formats = []
|
||||
subtitles = {}
|
||||
for asset in video_data.get('assets'):
|
||||
if asset.get('id') == main_video_asset:
|
||||
for source in asset.get('sources', []):
|
||||
source_location = source.get('location')
|
||||
if not source_location:
|
||||
continue
|
||||
source_type = source.get('type')
|
||||
if source_type == 'hls':
|
||||
m3u8_formats = self._extract_m3u8_formats(
|
||||
source_location, video_id, 'mp4',
|
||||
'm3u8_native', m3u8_id='hls', fatal=False)
|
||||
for f in m3u8_formats:
|
||||
mobj = re.search(r'/hls-(\d+)-(\d+)K', f['url'])
|
||||
if mobj:
|
||||
height, tbr = mobj.groups()
|
||||
height = int_or_none(height)
|
||||
f.update({
|
||||
'height': f.get('height') or height,
|
||||
'width': f.get('width') or int_or_none(height / 9.0 * 16.0 if height else None),
|
||||
'tbr': f.get('tbr') or int_or_none(tbr),
|
||||
})
|
||||
formats.extend(m3u8_formats)
|
||||
elif source_type == 'dash':
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
source_location, video_id, mpd_id='dash', fatal=False))
|
||||
else:
|
||||
formats.append({
|
||||
'format_id': source.get('name'),
|
||||
'url': source_location,
|
||||
'width': int_or_none(source.get('width')),
|
||||
'height': int_or_none(source.get('height')),
|
||||
'tbr': int_or_none(source.get('bitrate')),
|
||||
})
|
||||
|
||||
for caption in asset.get('caption_metadata', []):
|
||||
caption_url = caption.get('source_url')
|
||||
if not caption_url:
|
||||
continue
|
||||
subtitles.setdefault(caption.get('language', 'en'), []).append({
|
||||
'url': caption_url,
|
||||
'ext': determine_ext(caption_url, 'vtt'),
|
||||
})
|
||||
elif asset.get('type') == 'image':
|
||||
asset_location = asset.get('location')
|
||||
if not asset_location:
|
||||
continue
|
||||
thumbnails.append({
|
||||
'url': asset_location,
|
||||
'width': int_or_none(asset.get('width')),
|
||||
'height': int_or_none(asset.get('height')),
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'formats': formats,
|
||||
'thumbnails': thumbnails,
|
||||
'description': video_data.get('short_description'),
|
||||
'like_count': int_or_none(video_data.get('like_count')),
|
||||
'timestamp': parse_iso8601(video_data.get('released_at')),
|
||||
'series': series,
|
||||
'episode': episode,
|
||||
'season': season,
|
||||
'season_id': season_id,
|
||||
'season_number': season_number,
|
||||
'episode_number': episode_number,
|
||||
'subtitles': subtitles,
|
||||
'age_limit': parse_age_limit(video_data.get('rating')),
|
||||
}
|
@ -1,33 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class HarkIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?hark\.com/clips/(?P<id>.+?)-.+'
|
||||
_TEST = {
|
||||
'url': 'http://www.hark.com/clips/mmbzyhkgny-obama-beyond-the-afghan-theater-we-only-target-al-qaeda-on-may-23-2013',
|
||||
'md5': '6783a58491b47b92c7c1af5a77d4cbee',
|
||||
'info_dict': {
|
||||
'id': 'mmbzyhkgny',
|
||||
'ext': 'mp3',
|
||||
'title': 'Obama: \'Beyond The Afghan Theater, We Only Target Al Qaeda\' on May 23, 2013',
|
||||
'description': 'President Barack Obama addressed the nation live on May 23, 2013 in a speech aimed at addressing counter-terrorism policies including the use of drone strikes, detainees at Guantanamo Bay prison facility, and American citizens who are terrorists.',
|
||||
'duration': 11,
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
data = self._download_json(
|
||||
'http://www.hark.com/clips/%s.json' % video_id, video_id)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'url': data['url'],
|
||||
'title': data['name'],
|
||||
'description': data.get('description'),
|
||||
'thumbnail': data.get('image_original'),
|
||||
'duration': data.get('duration'),
|
||||
}
|
@ -105,8 +105,7 @@ class HeiseIE(InfoExtractor):
|
||||
webpage, default=None) or self._html_search_meta(
|
||||
'description', webpage)
|
||||
|
||||
kaltura_url = KalturaIE._extract_url(webpage)
|
||||
if kaltura_url:
|
||||
def _make_kaltura_result(kaltura_url):
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'url': smuggle_url(kaltura_url, {'source_url': url}),
|
||||
@ -115,6 +114,16 @@ class HeiseIE(InfoExtractor):
|
||||
'description': description,
|
||||
}
|
||||
|
||||
kaltura_url = KalturaIE._extract_url(webpage)
|
||||
if kaltura_url:
|
||||
return _make_kaltura_result(kaltura_url)
|
||||
|
||||
kaltura_id = self._search_regex(
|
||||
r'entry-id=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'kaltura id',
|
||||
default=None, group='id')
|
||||
if kaltura_id:
|
||||
return _make_kaltura_result('kaltura:2238431:%s' % kaltura_id)
|
||||
|
||||
yt_urls = YoutubeIE._extract_urls(webpage)
|
||||
if yt_urls:
|
||||
return self.playlist_from_matches(
|
||||
|
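Note on the Heise change above: when no Kaltura embed URL is found, the extractor now falls back to an `entry-id` attribute and builds a `kaltura:2238431:<id>` reference. The sketch below exercises the same regex on its own. The helper name and the sample markup are assumptions; only the pattern and the partner ID come from the hunk.

```python
import re

# Pattern copied from the hunk: the opening quote is captured and the ID
# runs until the same quote character appears again.
ENTRY_ID_RE = r'entry-id=(["\'])(?P<id>(?:(?!\1).)+)\1'


def kaltura_ref_from_entry_id(webpage, partner_id='2238431'):
    """Hypothetical helper: build a kaltura:<partner>:<entry> reference."""
    mobj = re.search(ENTRY_ID_RE, webpage)
    if mobj is None:
        return None
    return 'kaltura:%s:%s' % (partner_id, mobj.group('id'))


# Made-up markup for illustration; real pages may differ.
double_quoted = kaltura_ref_from_entry_id('<a-video entry-id="1_kkrq94sm"></a-video>')
single_quoted = kaltura_ref_from_entry_id("<a-video entry-id='0_abc123'></a-video>")
missing = kaltura_ref_from_entry_id('<a-video></a-video>')
```

The backreference makes the pattern work for both quote styles without allowing a mismatched closing quote.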
@ -118,6 +118,7 @@ class HotStarIE(HotStarBaseIE):
|
||||
if video_data.get('drmProtected'):
|
||||
raise ExtractorError('This video is DRM protected.', expected=True)
|
||||
|
||||
headers = {'Referer': url}
|
||||
formats = []
|
||||
geo_restricted = False
|
||||
playback_sets = self._call_api_v2('h/v2/play', video_id)['playBackSets']
|
||||
@ -137,10 +138,11 @@ class HotStarIE(HotStarBaseIE):
|
||||
if 'package:hls' in tags or ext == 'm3u8':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
format_url, video_id, 'mp4',
|
||||
entry_protocol='m3u8_native', m3u8_id='hls'))
|
||||
entry_protocol='m3u8_native',
|
||||
m3u8_id='hls', headers=headers))
|
||||
elif 'package:dash' in tags or ext == 'mpd':
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
format_url, video_id, mpd_id='dash'))
|
||||
format_url, video_id, mpd_id='dash', headers=headers))
|
||||
elif ext == 'f4m':
|
||||
# produce broken files
|
||||
pass
|
||||
@ -158,6 +160,9 @@ class HotStarIE(HotStarBaseIE):
|
||||
self.raise_geo_restricted(countries=['IN'])
|
||||
self._sort_formats(formats)
|
||||
|
||||
for f in formats:
|
||||
f.setdefault('http_headers', {}).update(headers)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
|
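Note on the HotStar change above: besides passing `headers` to the manifest downloads, the commit attaches the `Referer` header to every format so that the later media requests carry it too. The loop relies on `dict.setdefault`, which keeps any headers a format already has. A minimal sketch with made-up format dictionaries and a placeholder page URL follows.

```python
# Placeholder values; the real extractor uses the page URL being extracted.
headers = {'Referer': 'https://www.hotstar.com/1000076273'}

formats = [
    {'format_id': 'hls-500', 'url': 'https://example.com/a.m3u8'},
    {'format_id': 'dash-1', 'url': 'https://example.com/b.mpd',
     'http_headers': {'User-Agent': 'demo'}},
]

# Same statement as in the hunk: create 'http_headers' if it is missing,
# then merge the Referer into whatever is there.
for f in formats:
    f.setdefault('http_headers', {}).update(headers)
```

Existing keys such as `User-Agent` survive the merge; only `Referer` is added or overwritten.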
@ -1,85 +0,0 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
get_element_by_id,
|
||||
remove_end,
|
||||
)
|
||||
|
||||
|
||||
class IconosquareIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:iconosquare\.com|statigr\.am)/p/(?P<id>[^/]+)'
|
||||
_TEST = {
|
||||
'url': 'http://statigr.am/p/522207370455279102_24101272',
|
||||
'md5': '6eb93b882a3ded7c378ee1d6884b1814',
|
||||
'info_dict': {
|
||||
'id': '522207370455279102_24101272',
|
||||
'ext': 'mp4',
|
||||
'title': 'Instagram photo by @aguynamedpatrick (Patrick Janelle)',
|
||||
'description': 'md5:644406a9ec27457ed7aa7a9ebcd4ce3d',
|
||||
'timestamp': 1376471991,
|
||||
'upload_date': '20130814',
|
||||
'uploader': 'aguynamedpatrick',
|
||||
'uploader_id': '24101272',
|
||||
'comment_count': int,
|
||||
'like_count': int,
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
media = self._parse_json(
|
||||
get_element_by_id('mediaJson', webpage),
|
||||
video_id)
|
||||
|
||||
formats = [{
|
||||
'url': f['url'],
|
||||
'format_id': format_id,
|
||||
'width': int_or_none(f.get('width')),
|
||||
'height': int_or_none(f.get('height'))
|
||||
} for format_id, f in media['videos'].items()]
|
||||
self._sort_formats(formats)
|
||||
|
||||
title = remove_end(self._og_search_title(webpage), ' - via Iconosquare')
|
||||
|
||||
timestamp = int_or_none(media.get('created_time') or media.get('caption', {}).get('created_time'))
|
||||
description = media.get('caption', {}).get('text')
|
||||
|
||||
uploader = media.get('user', {}).get('username')
|
||||
uploader_id = media.get('user', {}).get('id')
|
||||
|
||||
comment_count = int_or_none(media.get('comments', {}).get('count'))
|
||||
like_count = int_or_none(media.get('likes', {}).get('count'))
|
||||
|
||||
thumbnails = [{
|
||||
'url': t['url'],
|
||||
'id': thumbnail_id,
|
||||
'width': int_or_none(t.get('width')),
|
||||
'height': int_or_none(t.get('height'))
|
||||
} for thumbnail_id, t in media.get('images', {}).items()]
|
||||
|
||||
comments = [{
|
||||
'id': comment.get('id'),
|
||||
'text': comment['text'],
|
||||
'timestamp': int_or_none(comment.get('created_time')),
|
||||
'author': comment.get('from', {}).get('full_name'),
|
||||
'author_id': comment.get('from', {}).get('username'),
|
||||
} for comment in media.get('comments', {}).get('data', []) if 'text' in comment]
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'thumbnails': thumbnails,
|
||||
'timestamp': timestamp,
|
||||
'uploader': uploader,
|
||||
'uploader_id': uploader_id,
|
||||
'comment_count': comment_count,
|
||||
'like_count': like_count,
|
||||
'formats': formats,
|
||||
'comments': comments,
|
||||
}
|
@ -22,7 +22,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class InstagramIE(InfoExtractor):
|
||||
_VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com/p/(?P<id>[^/?#&]+))'
|
||||
_VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com/(?:p|tv)/(?P<id>[^/?#&]+))'
|
||||
_TESTS = [{
|
||||
'url': 'https://instagram.com/p/aye83DjauH/?foo=bar#abc',
|
||||
'md5': '0d2da106a9d2631273e192b372806516',
|
||||
@ -92,6 +92,9 @@ class InstagramIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://instagram.com/p/9o6LshA7zy/embed/',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.instagram.com/tv/aye83DjauH/',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@staticmethod
|
||||
|
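Note on the Instagram change above: the only functional difference is the `(?:p|tv)` alternation, which lets `/tv/` links reuse the existing post extractor. A short sketch checking the new pattern against the test URLs from the hunk follows. The `parse_instagram_url` helper is an illustrative assumption.

```python
import re

# New pattern, copied from the InstagramIE hunk above.
VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com/(?:p|tv)/(?P<id>[^/?#&]+))'


def parse_instagram_url(url):
    """Return (canonical_url, media_id), or None if the URL is unsupported."""
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        return None
    return mobj.group('url'), mobj.group('id')


post = parse_instagram_url('https://instagram.com/p/aye83DjauH/?foo=bar#abc')
tv = parse_instagram_url('https://www.instagram.com/tv/aye83DjauH/')
other = parse_instagram_url('https://www.instagram.com/explore/tags/cats/')
```

The `url` group drops the query string and fragment, so both forms reduce to a clean canonical URL.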
@ -1,15 +1,13 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import json
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
int_or_none,
|
||||
xpath_text,
|
||||
)
|
||||
|
||||
|
||||
class InternetVideoArchiveIE(InfoExtractor):
|
||||
@ -20,7 +18,7 @@ class InternetVideoArchiveIE(InfoExtractor):
|
||||
'info_dict': {
|
||||
'id': '194487',
|
||||
'ext': 'mp4',
|
||||
'title': 'KICK-ASS 2',
|
||||
'title': 'Kick-Ass 2',
|
||||
'description': 'md5:c189d5b7280400630a1d3dd17eaa8d8a',
|
||||
},
|
||||
'params': {
|
||||
@ -33,68 +31,34 @@ class InternetVideoArchiveIE(InfoExtractor):
|
||||
def _build_json_url(query):
|
||||
return 'http://video.internetvideoarchive.net/player/6/configuration.ashx?' + query
|
||||
|
||||
@staticmethod
|
||||
def _build_xml_url(query):
|
||||
return 'http://video.internetvideoarchive.net/flash/players/flashconfiguration.aspx?' + query
|
||||
|
||||
def _real_extract(self, url):
|
||||
query = compat_urlparse.urlparse(url).query
|
||||
query_dic = compat_parse_qs(query)
|
||||
video_id = query_dic['publishedid'][0]
|
||||
|
||||
if '/player/' in url:
|
||||
configuration = self._download_json(url, video_id)
|
||||
|
||||
# There are multiple videos in the playlist while only the first one
|
||||
# matches the video played in browsers
|
||||
video_info = configuration['playlist'][0]
|
||||
title = video_info['title']
|
||||
|
||||
formats = []
|
||||
for source in video_info['sources']:
|
||||
file_url = source['file']
|
||||
if determine_ext(file_url) == 'm3u8':
|
||||
m3u8_formats = self._extract_m3u8_formats(
|
||||
file_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False)
|
||||
if m3u8_formats:
|
||||
formats.extend(m3u8_formats)
|
||||
file_url = m3u8_formats[0]['url']
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
file_url.replace('.m3u8', '.f4m'),
|
||||
video_id, f4m_id='hds', fatal=False))
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
file_url.replace('.m3u8', '.mpd'),
|
||||
video_id, mpd_id='dash', fatal=False))
|
||||
else:
|
||||
a_format = {
|
||||
'url': file_url,
|
||||
}
|
||||
|
||||
if source.get('label') and source['label'][-4:] == ' kbs':
|
||||
tbr = int_or_none(source['label'][:-4])
|
||||
a_format.update({
|
||||
'tbr': tbr,
|
||||
'format_id': 'http-%d' % tbr,
|
||||
})
|
||||
formats.append(a_format)
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
description = video_info.get('description')
|
||||
thumbnail = video_info.get('image')
|
||||
else:
|
||||
configuration = self._download_xml(url, video_id)
|
||||
formats = [{
|
||||
'url': xpath_text(configuration, './file', 'file URL', fatal=True),
|
||||
}]
|
||||
thumbnail = xpath_text(configuration, './image', 'thumbnail')
|
||||
title = 'InternetVideoArchive video %s' % video_id
|
||||
description = None
|
||||
query = compat_parse_qs(compat_urlparse.urlparse(url).query)
|
||||
video_id = query['publishedid'][0]
|
||||
data = self._download_json(
|
||||
'https://video.internetvideoarchive.net/videojs7/videojs7.ivasettings.ashx',
|
||||
video_id, data=json.dumps({
|
||||
'customerid': query['customerid'][0],
|
||||
'publishedid': video_id,
|
||||
}).encode())
|
||||
title = data['Title']
|
||||
formats = self._extract_m3u8_formats(
|
||||
data['VideoUrl'], video_id, 'mp4',
|
||||
'm3u8_native', m3u8_id='hls', fatal=False)
|
||||
file_url = formats[0]['url']
|
||||
if '.ism/' in file_url:
|
||||
replace_url = lambda x: re.sub(r'\.ism/[^?]+', '.ism/' + x, file_url)
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
replace_url('.f4m'), video_id, f4m_id='hds', fatal=False))
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
replace_url('.mpd'), video_id, mpd_id='dash', fatal=False))
|
||||
formats.extend(self._extract_ism_formats(
|
||||
replace_url('Manifest'), video_id, ism_id='mss', fatal=False))
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'formats': formats,
|
||||
'thumbnail': thumbnail,
|
||||
'description': description,
|
||||
'thumbnail': data.get('PosterUrl'),
|
||||
'description': data.get('Description'),
|
||||
}
|
||||
|
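The `replace_url` lambda in the hunk above derives sibling manifest URLs by rewriting everything between `.ism/` and the query string. A small standalone sketch of that substitution follows; the function name `sibling_manifest` and the example host are assumptions made for illustration.

```python
import re


def sibling_manifest(file_url, name):
    """Swap the path segment after '.ism/' for `name`, keeping any query string.

    Same regular expression as the replace_url lambda in the hunk above;
    the function name is hypothetical.
    """
    return re.sub(r'\.ism/[^?]+', '.ism/' + name, file_url)


if __name__ == '__main__':
    # Hypothetical HLS URL; example.invalid is a reserved placeholder host.
    hls = 'https://example.invalid/v/clip.ism/master.m3u8?token=abc'
    for name in ('.f4m', '.mpd', 'Manifest'):
        print(sibling_manifest(hls, name))
```

Because `[^?]+` stops at the first `?`, signed query parameters survive the rewrite, which is presumably why the commit moved from plain `str.replace` to a regular expression.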
@ -18,6 +18,8 @@ class IviIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?ivi\.(?:ru|tv)/(?:watch/(?:[^/]+/)?|video/player\?.*?videoId=)(?P<id>\d+)'
|
||||
_GEO_BYPASS = False
|
||||
_GEO_COUNTRIES = ['RU']
|
||||
_LIGHT_KEY = b'\xf1\x02\x32\xb7\xbc\x5c\x7a\xe8\xf7\x96\xc1\x33\x2b\x27\xa1\x8c'
|
||||
_LIGHT_URL = 'https://api.ivi.ru/light/'
|
||||
|
||||
_TESTS = [
|
||||
# Single movie
|
||||
@ -80,48 +82,77 @@ class IviIE(InfoExtractor):
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
data = {
|
||||
data = json.dumps({
|
||||
'method': 'da.content.get',
|
||||
'params': [
|
||||
video_id, {
|
||||
'site': 's183',
|
||||
'site': 's%d',
|
||||
'referrer': 'http://www.ivi.ru/watch/%s' % video_id,
|
||||
'contentid': video_id
|
||||
}
|
||||
]
|
||||
}
|
||||
}).encode()
|
||||
|
||||
try:
|
||||
from Crypto.Cipher import Blowfish
|
||||
from Crypto.Hash import CMAC
|
||||
|
||||
timestamp = self._download_json(
|
||||
self._LIGHT_URL, video_id,
|
||||
'Downloading timestamp JSON', data=json.dumps({
|
||||
'method': 'da.timestamp.get',
|
||||
'params': []
|
||||
}).encode())['result']
|
||||
|
||||
data = data % 353
|
||||
query = {
|
||||
'ts': timestamp,
|
||||
'sign': CMAC.new(self._LIGHT_KEY, timestamp.encode() + data, Blowfish).hexdigest(),
|
||||
}
|
||||
except ImportError:
|
||||
data = data % 183
|
||||
query = {}
|
||||
|
||||
video_json = self._download_json(
|
||||
'http://api.digitalaccess.ru/api/json/', video_id,
|
||||
'Downloading video JSON', data=json.dumps(data))
|
||||
self._LIGHT_URL, video_id,
|
||||
'Downloading video JSON', data=data, query=query)
|
||||
|
||||
if 'error' in video_json:
|
||||
error = video_json['error']
|
||||
origin = error['origin']
|
||||
error = video_json.get('error')
|
||||
if error:
|
||||
origin = error.get('origin')
|
||||
message = error.get('message') or error.get('user_message')
|
||||
extractor_msg = 'Unable to download video %s'
|
||||
if origin == 'NotAllowedForLocation':
|
||||
self.raise_geo_restricted(
|
||||
msg=error['message'], countries=self._GEO_COUNTRIES)
|
||||
self.raise_geo_restricted(message, self._GEO_COUNTRIES)
|
||||
elif origin == 'NoRedisValidData':
|
||||
raise ExtractorError('Video %s does not exist' % video_id, expected=True)
|
||||
raise ExtractorError(
|
||||
'Unable to download video %s: %s' % (video_id, error['message']),
|
||||
expected=True)
|
||||
extractor_msg = 'Video %s does not exist'
|
||||
elif message:
|
||||
if 'недоступен для просмотра на площадке s183' in message:
|
||||
raise ExtractorError(
|
||||
'pycryptodome not found. Please install it.',
|
||||
expected=True)
|
||||
extractor_msg += ': ' + message
|
||||
raise ExtractorError(extractor_msg % video_id, expected=True)
|
||||
|
||||
result = video_json['result']
|
||||
title = result['title']
|
||||
|
||||
quality = qualities(self._KNOWN_FORMATS)
|
||||
|
||||
formats = [{
|
||||
'url': x['url'],
|
||||
'format_id': x.get('content_format'),
|
||||
'quality': quality(x.get('content_format')),
|
||||
} for x in result['files'] if x.get('url')]
|
||||
|
||||
formats = []
|
||||
for f in result.get('files', []):
|
||||
f_url = f.get('url')
|
||||
content_format = f.get('content_format')
|
||||
if not f_url or '-MDRM-' in content_format or '-FPS-' in content_format:
|
||||
continue
|
||||
formats.append({
|
||||
'url': f_url,
|
||||
'format_id': content_format,
|
||||
'quality': quality(content_format),
|
||||
'filesize': int_or_none(f.get('size_in_bytes')),
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
title = result['title']
|
||||
|
||||
duration = int_or_none(result.get('duration'))
|
||||
compilation = result.get('compilation')
|
||||
episode = title if compilation else None
|
||||
|
||||
@ -158,7 +189,7 @@ class IviIE(InfoExtractor):
|
||||
'episode_number': episode_number,
|
||||
'thumbnails': thumbnails,
|
||||
'description': description,
|
||||
'duration': duration,
|
||||
'duration': int_or_none(result.get('duration')),
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
|
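The ivi hunk above serialises the request once with a `'s%d'` placeholder and fills in the site id afterwards (`353` when the signing libraries import, `183` otherwise). The sketch below reproduces only that templating step, assuming Python 3.5 or later where `%` formatting is defined for `bytes`; the CMAC signing is left out because it needs the third-party `Crypto` package. The function name `build_payload` is hypothetical.

```python
import json


def build_payload(video_id, site_id):
    """Build the JSON request body as bytes, then fill in the site id.

    Mirrors the `data = data % 353` / `data = data % 183` lines in the hunk
    above. Hypothetical helper; it assumes video_id contains no '%' character,
    which holds for the numeric ids the extractor matches.
    """
    template = json.dumps({
        'method': 'da.content.get',
        'params': [
            video_id, {
                'site': 's%d',  # placeholder resolved after serialisation
                'referrer': 'http://www.ivi.ru/watch/%s' % video_id,
                'contentid': video_id,
            },
        ],
    }).encode()
    return template % site_id


if __name__ == '__main__':
    print(build_payload('53141', 353).decode())
    print(build_payload('53141', 183).decode())
```

Keeping the payload as bytes matters in the original code because the same byte string is both signed and sent, so the signature covers exactly what goes over the wire.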
@ -1,38 +1,26 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
import hashlib
|
||||
import random
|
||||
|
||||
from ..compat import compat_urlparse
|
||||
from ..compat import compat_str
|
||||
from .common import InfoExtractor
|
||||
from ..utils import parse_duration
|
||||
from ..utils import (
|
||||
clean_html,
|
||||
int_or_none,
|
||||
try_get,
|
||||
)
|
||||
|
||||
|
||||
class JamendoBaseIE(InfoExtractor):
|
||||
def _extract_meta(self, webpage, fatal=True):
|
||||
title = self._og_search_title(
|
||||
webpage, default=None) or self._search_regex(
|
||||
r'<title>([^<]+)', webpage,
|
||||
'title', default=None)
|
||||
if title:
|
||||
title = self._search_regex(
|
||||
r'(.+?)\s*\|\s*Jamendo Music', title, 'title', default=None)
|
||||
if not title:
|
||||
title = self._html_search_meta(
|
||||
'name', webpage, 'title', fatal=fatal)
|
||||
mobj = re.search(r'(.+) - (.+)', title or '')
|
||||
artist, second = mobj.groups() if mobj else [None] * 2
|
||||
return title, artist, second
|
||||
|
||||
|
||||
class JamendoIE(JamendoBaseIE):
|
||||
class JamendoIE(InfoExtractor):
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:
|
||||
licensing\.jamendo\.com/[^/]+|
|
||||
(?:www\.)?jamendo\.com
|
||||
)
|
||||
/track/(?P<id>[0-9]+)/(?P<display_id>[^/?#&]+)
|
||||
/track/(?P<id>[0-9]+)(?:/(?P<display_id>[^/?#&]+))?
|
||||
'''
|
||||
_TESTS = [{
|
||||
'url': 'https://www.jamendo.com/track/196219/stories-from-emona-i',
|
||||
@ -45,7 +33,9 @@ class JamendoIE(JamendoBaseIE):
|
||||
'artist': 'Maya Filipič',
|
||||
'track': 'Stories from Emona I',
|
||||
'duration': 210,
|
||||
'thumbnail': r're:^https?://.*\.jpg'
|
||||
'thumbnail': r're:^https?://.*\.jpg',
|
||||
'timestamp': 1217438117,
|
||||
'upload_date': '20080730',
|
||||
}
|
||||
}, {
|
||||
'url': 'https://licensing.jamendo.com/en/track/1496667/energetic-rock',
|
||||
@ -53,15 +43,20 @@ class JamendoIE(JamendoBaseIE):
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = self._VALID_URL_RE.match(url)
|
||||
track_id = mobj.group('id')
|
||||
display_id = mobj.group('display_id')
|
||||
|
||||
track_id, display_id = self._VALID_URL_RE.match(url).groups()
|
||||
webpage = self._download_webpage(
|
||||
'https://www.jamendo.com/track/%s/%s' % (track_id, display_id),
|
||||
display_id)
|
||||
|
||||
title, artist, track = self._extract_meta(webpage)
|
||||
'https://www.jamendo.com/track/' + track_id, track_id)
|
||||
models = self._parse_json(self._html_search_regex(
|
||||
r"data-bundled-models='([^']+)",
|
||||
webpage, 'bundled models'), track_id)
|
||||
track = models['track']['models'][0]
|
||||
title = track_name = track['name']
|
||||
get_model = lambda x: try_get(models, lambda y: y[x]['models'][0], dict) or {}
|
||||
artist = get_model('artist')
|
||||
artist_name = artist.get('name')
|
||||
if artist_name:
|
||||
title = '%s - %s' % (artist_name, title)
|
||||
album = get_model('album')
|
||||
|
||||
formats = [{
|
||||
'url': 'https://%s.jamendo.com/?trackid=%s&format=%s&from=app-97dab294'
|
||||
@ -77,31 +72,58 @@ class JamendoIE(JamendoBaseIE):
|
||||
))]
|
||||
self._sort_formats(formats)
|
||||
|
||||
thumbnail = self._html_search_meta(
|
||||
'image', webpage, 'thumbnail', fatal=False)
|
||||
duration = parse_duration(self._search_regex(
|
||||
r'<span[^>]+itemprop=["\']duration["\'][^>]+content=["\'](.+?)["\']',
|
||||
webpage, 'duration', fatal=False))
|
||||
urls = []
|
||||
thumbnails = []
|
||||
for _, covers in track.get('cover', {}).items():
|
||||
for cover_id, cover_url in covers.items():
|
||||
if not cover_url or cover_url in urls:
|
||||
continue
|
||||
urls.append(cover_url)
|
||||
size = int_or_none(cover_id.lstrip('size'))
|
||||
thumbnails.append({
|
||||
'id': cover_id,
|
||||
'url': cover_url,
|
||||
'width': size,
|
||||
'height': size,
|
||||
})
|
||||
|
||||
tags = []
|
||||
for tag in track.get('tags', []):
|
||||
tag_name = tag.get('name')
|
||||
if not tag_name:
|
||||
continue
|
||||
tags.append(tag_name)
|
||||
|
||||
stats = track.get('stats') or {}
|
||||
|
||||
return {
|
||||
'id': track_id,
|
||||
'display_id': display_id,
|
||||
'thumbnail': thumbnail,
|
||||
'thumbnails': thumbnails,
|
||||
'title': title,
|
||||
'duration': duration,
|
||||
'artist': artist,
|
||||
'track': track,
|
||||
'formats': formats
|
||||
'description': track.get('description'),
|
||||
'duration': int_or_none(track.get('duration')),
|
||||
'artist': artist_name,
|
||||
'track': track_name,
|
||||
'album': album.get('name'),
|
||||
'formats': formats,
|
||||
'license': '-'.join(track.get('licenseCC', [])) or None,
|
||||
'timestamp': int_or_none(track.get('dateCreated')),
|
||||
'view_count': int_or_none(stats.get('listenedAll')),
|
||||
'like_count': int_or_none(stats.get('favorited')),
|
||||
'average_rating': int_or_none(stats.get('averageNote')),
|
||||
'tags': tags,
|
||||
}
|
||||
|
||||
|
||||
class JamendoAlbumIE(JamendoBaseIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?jamendo\.com/album/(?P<id>[0-9]+)/(?P<display_id>[\w-]+)'
|
||||
class JamendoAlbumIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?jamendo\.com/album/(?P<id>[0-9]+)'
|
||||
_TEST = {
|
||||
'url': 'https://www.jamendo.com/album/121486/duck-on-cover',
|
||||
'info_dict': {
|
||||
'id': '121486',
|
||||
'title': 'Shearer - Duck On Cover'
|
||||
'title': 'Duck On Cover',
|
||||
'description': 'md5:c2920eaeef07d7af5b96d7c64daf1239',
|
||||
},
|
||||
'playlist': [{
|
||||
'md5': 'e1a2fcb42bda30dfac990212924149a8',
|
||||
@ -111,6 +133,8 @@ class JamendoAlbumIE(JamendoBaseIE):
|
||||
'title': 'Shearer - Warmachine',
|
||||
'artist': 'Shearer',
|
||||
'track': 'Warmachine',
|
||||
'timestamp': 1368089771,
|
||||
'upload_date': '20130509',
|
||||
}
|
||||
}, {
|
||||
'md5': '1f358d7b2f98edfe90fd55dac0799d50',
|
||||
@ -120,6 +144,8 @@ class JamendoAlbumIE(JamendoBaseIE):
|
||||
'title': 'Shearer - Without Your Ghost',
|
||||
'artist': 'Shearer',
|
||||
'track': 'Without Your Ghost',
|
||||
'timestamp': 1368089771,
|
||||
'upload_date': '20130509',
|
||||
}
|
||||
}],
|
||||
'params': {
|
||||
@ -127,24 +153,35 @@ class JamendoAlbumIE(JamendoBaseIE):
|
||||
}
|
||||
}
|
||||
|
||||
def _call_api(self, resource, resource_id):
|
||||
path = '/api/%ss' % resource
|
||||
rand = compat_str(random.random())
|
||||
return self._download_json(
|
||||
'https://www.jamendo.com' + path, resource_id, query={
|
||||
'id[]': resource_id,
|
||||
}, headers={
|
||||
'X-Jam-Call': '$%s*%s~' % (hashlib.sha1((path + rand).encode()).hexdigest(), rand)
|
||||
})[0]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = self._VALID_URL_RE.match(url)
|
||||
album_id = mobj.group('id')
|
||||
album_id = self._match_id(url)
|
||||
album = self._call_api('album', album_id)
|
||||
album_name = album.get('name')
|
||||
|
||||
webpage = self._download_webpage(url, mobj.group('display_id'))
|
||||
entries = []
|
||||
for track in album.get('tracks', []):
|
||||
track_id = track.get('id')
|
||||
if not track_id:
|
||||
continue
|
||||
track_id = compat_str(track_id)
|
||||
entries.append({
|
||||
'_type': 'url_transparent',
|
||||
'url': 'https://www.jamendo.com/track/' + track_id,
|
||||
'ie_key': JamendoIE.ie_key(),
|
||||
'id': track_id,
|
||||
'album': album_name,
|
||||
})
|
||||
|
||||
title, artist, album = self._extract_meta(webpage, fatal=False)
|
||||
|
||||
entries = [{
|
||||
'_type': 'url_transparent',
|
||||
'url': compat_urlparse.urljoin(url, m.group('path')),
|
||||
'ie_key': JamendoIE.ie_key(),
|
||||
'id': self._search_regex(
|
||||
r'/track/(\d+)', m.group('path'), 'track id', default=None),
|
||||
'artist': artist,
|
||||
'album': album,
|
||||
} for m in re.finditer(
|
||||
r'<a[^>]+href=(["\'])(?P<path>(?:(?!\1).)+)\1[^>]+class=["\'][^>]*js-trackrow-albumpage-link',
|
||||
webpage)]
|
||||
|
||||
return self.playlist_result(entries, album_id, title)
|
||||
return self.playlist_result(
|
||||
entries, album_id, album_name,
|
||||
clean_html(try_get(album, lambda x: x['description']['en'], compat_str)))
|
||||
|
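The `_call_api` method in the Jamendo hunk above sends an `X-Jam-Call` header built from a SHA-1 digest of the API path plus a random string. A standalone sketch of that construction is below; the function name `jam_call_header` and the optional `rand` parameter (added so the result can be reproduced) are assumptions, not part of the extractor.

```python
import hashlib
import random


def jam_call_header(path, rand=None):
    """Return a value shaped like the X-Jam-Call header in the hunk above.

    Format: '$' + sha1(path + rand) + '*' + rand + '~'.
    `rand` is injectable here only to make the output deterministic in tests.
    """
    if rand is None:
        rand = str(random.random())
    digest = hashlib.sha1((path + rand).encode()).hexdigest()
    return '$%s*%s~' % (digest, rand)


if __name__ == '__main__':
    # '/api/albums' is what '/api/%ss' % 'album' evaluates to.
    print(jam_call_header('/api/%ss' % 'album'))
```

The random component is transmitted in clear after the `*`, so the server can recompute the digest; whether the server applies any further checks is not something this diff shows.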
@ -7,7 +7,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class JWPlatformIE(InfoExtractor):
|
||||
_VALID_URL = r'(?:https?://(?:content\.jwplatform|cdn\.jwplayer)\.com/(?:(?:feed|player|thumb|preview|video)s|jw6|v2/media)/|jwplatform:)(?P<id>[a-zA-Z0-9]{8})'
|
||||
_VALID_URL = r'(?:https?://(?:content\.jwplatform|cdn\.jwplayer)\.com/(?:(?:feed|player|thumb|preview)s|jw6|v2/media)/|jwplatform:)(?P<id>[a-zA-Z0-9]{8})'
|
||||
_TESTS = [{
|
||||
'url': 'http://content.jwplatform.com/players/nPripu9l-ALJ3XQCI.js',
|
||||
'md5': 'fa8899fa601eb7c83a64e9d568bdf325',
|
||||
|
@ -6,14 +6,15 @@ from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
strip_or_none,
|
||||
unified_timestamp,
|
||||
update_url_query,
|
||||
)
|
||||
|
||||
|
||||
class KakaoIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://tv\.kakao\.com/channel/(?P<channel>\d+)/cliplink/(?P<id>\d+)'
|
||||
_API_BASE = 'http://tv.kakao.com/api/v1/ft/cliplinks'
|
||||
_VALID_URL = r'https?://(?:play-)?tv\.kakao\.com/(?:channel/\d+|embed/player)/cliplink/(?P<id>\d+|[^?#&]+@my)'
|
||||
_API_BASE_TMPL = 'http://tv.kakao.com/api/v1/ft/cliplinks/%s/'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://tv.kakao.com/channel/2671005/cliplink/301965083',
|
||||
@ -36,7 +37,7 @@ class KakaoIE(InfoExtractor):
|
||||
'description': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)\r\n\r\n[쇼! 음악중심] 20160611, 507회',
|
||||
'title': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)',
|
||||
'uploader_id': 2653210,
|
||||
'uploader': '쇼 음악중심',
|
||||
'uploader': '쇼! 음악중심',
|
||||
'timestamp': 1485684628,
|
||||
'upload_date': '20170129',
|
||||
}
|
||||
@ -44,6 +45,8 @@ class KakaoIE(InfoExtractor):
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
display_id = video_id.rstrip('@my')
|
||||
api_base = self._API_BASE_TMPL % video_id
|
||||
|
||||
player_header = {
|
||||
'Referer': update_url_query(
|
||||
@ -55,20 +58,23 @@ class KakaoIE(InfoExtractor):
|
||||
})
|
||||
}
|
||||
|
||||
QUERY_COMMON = {
|
||||
query = {
|
||||
'player': 'monet_html5',
|
||||
'referer': url,
|
||||
'uuid': '',
|
||||
'service': 'kakao_tv',
|
||||
'section': '',
|
||||
'dteType': 'PC',
|
||||
'fields': ','.join([
|
||||
'-*', 'tid', 'clipLink', 'displayTitle', 'clip', 'title',
|
||||
'description', 'channelId', 'createTime', 'duration', 'playCount',
|
||||
'likeCount', 'commentCount', 'tagList', 'channel', 'name',
|
||||
'clipChapterThumbnailList', 'thumbnailUrl', 'timeInSec', 'isDefault',
|
||||
'videoOutputList', 'width', 'height', 'kbps', 'profile', 'label'])
|
||||
}
|
||||
|
||||
query = QUERY_COMMON.copy()
|
||||
query['fields'] = 'clipLink,clip,channel,hasPlusFriend,-service,-tagList'
|
||||
impress = self._download_json(
|
||||
'%s/%s/impress' % (self._API_BASE, video_id),
|
||||
video_id, 'Downloading video info',
|
||||
api_base + 'impress', display_id, 'Downloading video info',
|
||||
query=query, headers=player_header)
|
||||
|
||||
clip_link = impress['clipLink']
|
||||
@ -76,32 +82,22 @@ class KakaoIE(InfoExtractor):
|
||||
|
||||
title = clip.get('title') or clip_link.get('displayTitle')
|
||||
|
||||
tid = impress.get('tid', '')
|
||||
|
||||
query = QUERY_COMMON.copy()
|
||||
query.update({
|
||||
'tid': tid,
|
||||
'profile': 'HIGH',
|
||||
})
|
||||
raw = self._download_json(
|
||||
'%s/%s/raw' % (self._API_BASE, video_id),
|
||||
video_id, 'Downloading video formats info',
|
||||
query=query, headers=player_header)
|
||||
query['tid'] = impress.get('tid', '')
|
||||
|
||||
formats = []
|
||||
for fmt in raw.get('outputList', []):
|
||||
for fmt in clip.get('videoOutputList', []):
|
||||
try:
|
||||
profile_name = fmt['profile']
|
||||
if profile_name == 'AUDIO':
|
||||
continue
|
||||
query.update({
|
||||
'profile': profile_name,
|
||||
'fields': '-*,url',
|
||||
})
|
||||
fmt_url_json = self._download_json(
|
||||
'%s/%s/raw/videolocation' % (self._API_BASE, video_id),
|
||||
video_id,
|
||||
api_base + 'raw/videolocation', display_id,
|
||||
'Downloading video URL for profile %s' % profile_name,
|
||||
query={
|
||||
'service': 'kakao_tv',
|
||||
'section': '',
|
||||
'tid': tid,
|
||||
'profile': profile_name
|
||||
}, headers=player_header, fatal=False)
|
||||
query=query, headers=player_header, fatal=False)
|
||||
|
||||
if fmt_url_json is None:
|
||||
continue
|
||||
@ -113,7 +109,8 @@ class KakaoIE(InfoExtractor):
|
||||
'width': int_or_none(fmt.get('width')),
|
||||
'height': int_or_none(fmt.get('height')),
|
||||
'format_note': fmt.get('label'),
|
||||
'filesize': int_or_none(fmt.get('filesize'))
|
||||
'filesize': int_or_none(fmt.get('filesize')),
|
||||
'tbr': int_or_none(fmt.get('kbps')),
|
||||
})
|
||||
except KeyError:
|
||||
pass
|
||||
@ -134,9 +131,9 @@ class KakaoIE(InfoExtractor):
|
||||
})
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'id': display_id,
|
||||
'title': title,
|
||||
'description': clip.get('description'),
|
||||
'description': strip_or_none(clip.get('description')),
|
||||
'uploader': clip_link.get('channel', {}).get('name'),
|
||||
'uploader_id': clip_link.get('channelId'),
|
||||
'thumbnails': thumbs,
|
||||
@ -146,4 +143,5 @@ class KakaoIE(InfoExtractor):
|
||||
'like_count': int_or_none(clip.get('likeCount')),
|
||||
'comment_count': int_or_none(clip.get('commentCount')),
|
||||
'formats': formats,
|
||||
'tags': clip.get('tagList'),
|
||||
}
|
||||
|
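The Kakao hunk above computes `display_id = video_id.rstrip('@my')`. Note that `str.rstrip` treats its argument as a set of characters rather than a suffix, so an id whose stem ends in `m`, `y` or `@` would lose more than the `@my` marker. The sketch below contrasts the two behaviours; `strip_my_suffix` is a hypothetical alternative, not what the commit does, and the sample id `tommy@my` is invented for illustration.

```python
def strip_my_suffix(video_id, suffix='@my'):
    """Remove a literal trailing suffix, leaving the rest of the id intact.

    Hypothetical helper shown for comparison with str.rstrip.
    """
    if video_id.endswith(suffix):
        return video_id[:-len(suffix)]
    return video_id


if __name__ == '__main__':
    for vid in ('301965083', 'tommy@my'):
        print(vid, '| rstrip:', vid.rstrip('@my'), '| suffix:', strip_my_suffix(vid))
```

For the purely numeric ids in the existing tests both approaches give the same result, so this only matters for the new `[^?#&]+@my` form of the id.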
@ -151,14 +151,15 @@ class KalturaIE(InfoExtractor):
|
||||
if mobj:
|
||||
embed_info = mobj.groupdict()
|
||||
for k, v in embed_info.items():
|
||||
embed_info[k] = v.strip()
|
||||
if v:
|
||||
embed_info[k] = v.strip()
|
||||
url = 'kaltura:%(partner_id)s:%(id)s' % embed_info
|
||||
escaped_pid = re.escape(embed_info['partner_id'])
|
||||
service_url = re.search(
|
||||
r'<script[^>]+src=["\']((?:https?:)?//.+?)/p/%s/sp/%s00/embedIframeJs' % (escaped_pid, escaped_pid),
|
||||
service_mobj = re.search(
|
||||
r'<script[^>]+src=(["\'])(?P<id>(?:https?:)?//(?:(?!\1).)+)/p/%s/sp/%s00/embedIframeJs' % (escaped_pid, escaped_pid),
|
||||
webpage)
|
||||
if service_url:
|
||||
url = smuggle_url(url, {'service_url': service_url.group(1)})
|
||||
if service_mobj:
|
||||
url = smuggle_url(url, {'service_url': service_mobj.group('id')})
|
||||
return url
|
||||
|
||||
def _kaltura_api_call(self, video_id, actions, service_url=None, *args, **kwargs):
|
||||
|
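The Kaltura hunk above adds an `if v:` guard before calling `v.strip()`. A short sketch of why the guard is needed: `groupdict()` returns `None` for named groups that did not participate in the match, and `None` has no `strip` method. The pattern and the function name `clean_groups` below are invented for illustration.

```python
import re


def clean_groups(mobj):
    """Strip whitespace from matched groups, leaving unmatched (None) groups alone.

    Mirrors the guarded loop in the hunk above; hypothetical helper.
    """
    info = mobj.groupdict()
    for k, v in info.items():
        if v:  # unmatched optional groups are None
            info[k] = v.strip()
    return info


if __name__ == '__main__':
    # Toy pattern: 'partner_id' always matches, 'id' is optional.
    mobj = re.match(r'(?P<partner_id>\s*\d+\s*)(?::(?P<id>\w+))?', ' 103 ')
    print(clean_groups(mobj))
```

Assigning to existing keys while iterating over `items()` is safe because the set of keys does not change.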
@ -1,39 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class KeekIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?keek\.com/keek/(?P<id>\w+)'
|
||||
IE_NAME = 'keek'
|
||||
_TEST = {
|
||||
'url': 'https://www.keek.com/keek/NODfbab',
|
||||
'md5': '9b0636f8c0f7614afa4ea5e4c6e57e83',
|
||||
'info_dict': {
|
||||
'id': 'NODfbab',
|
||||
'ext': 'mp4',
|
||||
'title': 'md5:35d42050a3ece241d5ddd7fdcc6fd896',
|
||||
'uploader': 'ytdl',
|
||||
'uploader_id': 'eGT5bab',
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'url': self._og_search_video_url(webpage),
|
||||
'ext': 'mp4',
|
||||
'title': self._og_search_description(webpage).strip(),
|
||||
'thumbnail': self._og_search_thumbnail(webpage),
|
||||
'uploader': self._search_regex(
|
||||
r'data-username=(["\'])(?P<uploader>.+?)\1', webpage,
|
||||
'uploader', fatal=False, group='uploader'),
|
||||
'uploader_id': self._search_regex(
|
||||
r'data-user-id=(["\'])(?P<uploader_id>.+?)\1', webpage,
|
||||
'uploader id', fatal=False, group='uploader_id'),
|
||||
}
|
221
youtube_dl/extractor/kinja.py
Normal file
@ -0,0 +1,221 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_str,
|
||||
compat_urllib_parse_unquote,
|
||||
)
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_iso8601,
|
||||
strip_or_none,
|
||||
try_get,
|
||||
unescapeHTML,
|
||||
urljoin,
|
||||
)
|
||||
|
||||
|
||||
class KinjaEmbedIE(InfoExtractor):
|
||||
IENAME = 'kinja:embed'
|
||||
_DOMAIN_REGEX = r'''(?:[^.]+\.)?
|
||||
(?:
|
||||
avclub|
|
||||
clickhole|
|
||||
deadspin|
|
||||
gizmodo|
|
||||
jalopnik|
|
||||
jezebel|
|
||||
kinja|
|
||||
kotaku|
|
||||
lifehacker|
|
||||
splinternews|
|
||||
the(?:inventory|onion|root|takeout)
|
||||
)\.com'''
|
||||
_COMMON_REGEX = r'''/
|
||||
(?:
|
||||
ajax/inset|
|
||||
embed/video
|
||||
)/iframe\?.*?\bid='''
|
||||
_VALID_URL = r'''(?x)https?://%s%s
|
||||
(?P<type>
|
||||
fb|
|
||||
imgur|
|
||||
instagram|
|
||||
jwp(?:layer)?-video|
|
||||
kinjavideo|
|
||||
mcp|
|
||||
megaphone|
|
||||
ooyala|
|
||||
soundcloud(?:-playlist)?|
|
||||
tumblr-post|
|
||||
twitch-stream|
|
||||
twitter|
|
||||
ustream-channel|
|
||||
vimeo|
|
||||
vine|
|
||||
youtube-(?:list|video)
|
||||
)-(?P<id>[^&]+)''' % (_DOMAIN_REGEX, _COMMON_REGEX)
|
||||
_TESTS = [{
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=fb-10103303356633621',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=kinjavideo-100313',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=megaphone-PPY1300931075',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=ooyala-xzMXhleDpopuT0u1ijt_qZj3Va-34pEX%2FZTIxYmJjZDM2NWYzZDViZGRiOWJjYzc5',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=soundcloud-128574047',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=soundcloud-playlist-317413750',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=tumblr-post-160130699814-daydreams-at-midnight',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=twitch-stream-libratus_extra',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=twitter-1068875942473404422',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=ustream-channel-10414700',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=vimeo-120153502',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=vine-5BlvV5qqPrD',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=youtube-list-BCQ3KyrPjgA/PLE6509247C270A72E',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://kinja.com/ajax/inset/iframe?id=youtube-video-00QyL0AgPAE',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_JWPLATFORM_PROVIDER = ('cdn.jwplayer.com/v2/media/', 'JWPlatform')
|
||||
_PROVIDER_MAP = {
|
||||
'fb': ('facebook.com/video.php?v=', 'Facebook'),
|
||||
'imgur': ('imgur.com/', 'Imgur'),
|
||||
'instagram': ('instagram.com/p/', 'Instagram'),
|
||||
'jwplayer-video': _JWPLATFORM_PROVIDER,
|
||||
'jwp-video': _JWPLATFORM_PROVIDER,
|
||||
'megaphone': ('player.megaphone.fm/', 'Generic'),
|
||||
'ooyala': ('player.ooyala.com/player.js?embedCode=', 'Ooyala'),
|
||||
'soundcloud': ('api.soundcloud.com/tracks/', 'Soundcloud'),
|
||||
'soundcloud-playlist': ('api.soundcloud.com/playlists/', 'SoundcloudPlaylist'),
|
||||
'tumblr-post': ('%s.tumblr.com/post/%s', 'Tumblr'),
|
||||
'twitch-stream': ('twitch.tv/', 'TwitchStream'),
|
||||
'twitter': ('twitter.com/i/cards/tfw/v1/', 'TwitterCard'),
|
||||
'ustream-channel': ('ustream.tv/embed/', 'Ustream'),
|
||||
'vimeo': ('vimeo.com/', 'Vimeo'),
|
||||
'vine': ('vine.co/v/', 'Vine'),
|
||||
'youtube-list': ('youtube.com/embed/%s?list=%s', 'YoutubePlaylist'),
|
||||
'youtube-video': ('youtube.com/embed/', 'Youtube'),
|
||||
}
|
||||
|
||||
@staticmethod
|
||||
def _extract_urls(webpage, url):
|
||||
return [urljoin(url, unescapeHTML(mobj.group('url'))) for mobj in re.finditer(
|
||||
r'(?x)<iframe[^>]+?src=(?P<q>["\'])(?P<url>(?:(?:https?:)?//%s)?%s(?:(?!\1).)+)\1' % (KinjaEmbedIE._DOMAIN_REGEX, KinjaEmbedIE._COMMON_REGEX),
|
||||
webpage)]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_type, video_id = re.match(self._VALID_URL, url).groups()
|
||||
|
||||
provider = self._PROVIDER_MAP.get(video_type)
|
||||
if provider:
|
||||
video_id = compat_urllib_parse_unquote(video_id)
|
||||
if video_type == 'tumblr-post':
|
||||
video_id, blog = video_id.split('-', 1)
|
||||
result_url = provider[0] % (blog, video_id)
|
||||
elif video_type == 'youtube-list':
|
||||
video_id, playlist_id = video_id.split('/')
|
||||
result_url = provider[0] % (video_id, playlist_id)
|
||||
else:
|
||||
if video_type == 'ooyala':
|
||||
video_id = video_id.split('/')[0]
|
||||
result_url = provider[0] + video_id
|
||||
return self.url_result('http://' + result_url, provider[1])
|
||||
|
||||
if video_type == 'kinjavideo':
|
||||
data = self._download_json(
|
||||
'https://kinja.com/api/core/video/views/videoById',
|
||||
video_id, query={'videoId': video_id})['data']
|
||||
title = data['title']
|
||||
|
||||
formats = []
|
||||
for k in ('signedPlaylist', 'streaming'):
|
||||
m3u8_url = data.get(k + 'Url')
|
||||
if m3u8_url:
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, 'mp4', 'm3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
self._sort_formats(formats)
|
||||
|
||||
thumbnail = None
|
||||
poster = data.get('poster') or {}
|
||||
poster_id = poster.get('id')
|
||||
if poster_id:
|
||||
thumbnail = 'https://i.kinja-img.com/gawker-media/image/upload/%s.%s' % (poster_id, poster.get('format') or 'jpg')
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': strip_or_none(data.get('description')),
|
||||
'formats': formats,
|
||||
'tags': data.get('tags'),
|
||||
'timestamp': int_or_none(try_get(
|
||||
data, lambda x: x['postInfo']['publishTimeMillis']), 1000),
|
||||
'thumbnail': thumbnail,
|
||||
'uploader': data.get('network'),
|
||||
}
|
||||
else:
|
||||
video_data = self._download_json(
|
||||
'https://api.vmh.univision.com/metadata/v1/content/' + video_id,
|
||||
video_id)['videoMetadata']
|
||||
iptc = video_data['photoVideoMetadataIPTC']
|
||||
title = iptc['title']['en']
|
||||
fmg = video_data.get('photoVideoMetadata_fmg') or {}
|
||||
tvss_domain = fmg.get('tvssDomain') or 'https://auth.univision.com'
|
||||
data = self._download_json(
|
||||
tvss_domain + '/api/v3/video-auth/url-signature-tokens',
|
||||
video_id, query={'mcpids': video_id})['data'][0]
|
||||
formats = []
|
||||
|
||||
rendition_url = data.get('renditionUrl')
|
||||
if rendition_url:
|
||||
formats = self._extract_m3u8_formats(
|
||||
rendition_url, video_id, 'mp4',
|
||||
'm3u8_native', m3u8_id='hls', fatal=False)
|
||||
|
||||
fallback_rendition_url = data.get('fallbackRenditionUrl')
|
||||
if fallback_rendition_url:
|
||||
formats.append({
|
||||
'format_id': 'fallback',
|
||||
'tbr': int_or_none(self._search_regex(
|
||||
r'_(\d+)\.mp4', fallback_rendition_url,
|
||||
'bitrate', default=None)),
|
||||
'url': fallback_rendition_url,
|
||||
})
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'thumbnail': try_get(iptc, lambda x: x['cloudinaryLink']['link'], compat_str),
|
||||
'uploader': fmg.get('network'),
|
||||
'duration': int_or_none(iptc.get('fileDuration')),
|
||||
'formats': formats,
|
||||
'description': try_get(iptc, lambda x: x['description']['en'], compat_str),
|
||||
'timestamp': parse_iso8601(iptc.get('dateReleased')),
|
||||
}
|
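The provider branch of `KinjaEmbedIE._real_extract` above turns an embed id into a URL for another extractor. The sketch below reproduces that dispatch outside youtube-dl, with a subset of the `_PROVIDER_MAP` entries copied from the new file. `build_result_url` is a hypothetical name, and `urllib.parse.unquote` stands in for `compat_urllib_parse_unquote` on the assumption of Python 3.

```python
from urllib.parse import unquote

# Subset of KinjaEmbedIE._PROVIDER_MAP from the file above.
PROVIDER_MAP = {
    'ooyala': ('player.ooyala.com/player.js?embedCode=', 'Ooyala'),
    'tumblr-post': ('%s.tumblr.com/post/%s', 'Tumblr'),
    'vimeo': ('vimeo.com/', 'Vimeo'),
    'youtube-list': ('youtube.com/embed/%s?list=%s', 'YoutubePlaylist'),
}


def build_result_url(video_type, video_id):
    """Return (url, extractor key) for a provider embed, or None if unmapped.

    Follows the same branching as the hunk above; hypothetical helper.
    """
    provider = PROVIDER_MAP.get(video_type)
    if not provider:
        return None  # e.g. 'kinjavideo' and 'mcp' are handled via API calls
    video_id = unquote(video_id)
    if video_type == 'tumblr-post':
        # The id has the form '<post id>-<blog name>'.
        video_id, blog = video_id.split('-', 1)
        result_url = provider[0] % (blog, video_id)
    elif video_type == 'youtube-list':
        # The id has the form '<video id>/<playlist id>'.
        video_id, playlist_id = video_id.split('/')
        result_url = provider[0] % (video_id, playlist_id)
    else:
        if video_type == 'ooyala':
            # Only the part before the first '/' is the embed code.
            video_id = video_id.split('/')[0]
        result_url = provider[0] + video_id
    return 'http://' + result_url, provider[1]


if __name__ == '__main__':
    print(build_result_url('tumblr-post', '160130699814-daydreams-at-midnight'))
    print(build_result_url('youtube-list', 'BCQ3KyrPjgA/PLE6509247C270A72E'))
    print(build_result_url('vimeo', '120153502'))
```

The sample ids are taken from the `_TESTS` URLs in the same file. Types without a map entry fall through, matching the way the original handles `kinjavideo` and `mcp` separately.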
@ -20,7 +20,7 @@ class LA7IE(InfoExtractor):
|
||||
'url': 'http://www.la7.it/crozza/video/inccool8-02-10-2015-163722',
|
||||
'md5': '8b613ffc0c4bf9b9e377169fc19c214c',
|
||||
'info_dict': {
|
||||
'id': 'inccool8-02-10-2015-163722',
|
||||
'id': '0_42j6wd36',
|
||||
'ext': 'mp4',
|
||||
'title': 'Inc.Cool8',
|
||||
'description': 'Benvenuti nell\'incredibile mondo della INC. COOL. 8. dove “INC.” sta per “Incorporated” “COOL” sta per “fashion” ed Eight sta per il gesto atletico',
|
||||
@ -57,7 +57,7 @@ class LA7IE(InfoExtractor):
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'url': smuggle_url('kaltura:103:%s' % player_data['vid'], {
|
||||
'service_url': 'http://kdam.iltrovatore.it',
|
||||
'service_url': 'http://nkdam.iltrovatore.it',
|
||||
}),
|
||||
'id': video_id,
|
||||
'title': player_data['title'],
|
||||
|
@ -1,33 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class LearnrIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?learnr\.pro/view/video/(?P<id>[0-9]+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.learnr.pro/view/video/51624-web-development-tutorial-for-beginners-1-how-to-build-webpages-with-html-css-javascript',
|
||||
'md5': '3719fdf0a68397f49899e82c308a89de',
|
||||
'info_dict': {
|
||||
'id': '51624',
|
||||
'ext': 'mp4',
|
||||
'title': 'Web Development Tutorial for Beginners (#1) - How to build webpages with HTML, CSS, Javascript',
|
||||
'description': 'md5:b36dbfa92350176cdf12b4d388485503',
|
||||
'uploader': 'LearnCode.academy',
|
||||
'uploader_id': 'learncodeacademy',
|
||||
'upload_date': '20131021',
|
||||
},
|
||||
'add_ie': ['Youtube'],
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'url': self._search_regex(
|
||||
r"videoId\s*:\s*'([^']+)'", webpage, 'youtube id'),
|
||||
'id': video_id,
|
||||
}
|
@ -5,24 +5,27 @@ import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
clean_html,
|
||||
compat_str,
|
||||
int_or_none,
|
||||
unified_strdate,
|
||||
parse_iso8601,
|
||||
)
|
||||
|
||||
|
||||
class LnkGoIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?lnkgo\.(?:alfa\.)?lt/visi-video/(?P<show>[^/]+)/ziurek-(?P<id>[A-Za-z0-9-]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?lnk(?:go)?\.(?:alfa\.)?lt/(?:visi-video/[^/]+|video)/(?P<id>[A-Za-z0-9-]+)(?:/(?P<episode_id>\d+))?'
|
||||
_TESTS = [{
|
||||
'url': 'http://lnkgo.alfa.lt/visi-video/yra-kaip-yra/ziurek-yra-kaip-yra-162',
|
||||
'url': 'http://www.lnkgo.lt/visi-video/aktualai-pratesimas/ziurek-putka-trys-klausimai',
|
||||
'info_dict': {
|
||||
'id': '46712',
|
||||
'id': '10809',
|
||||
'ext': 'mp4',
|
||||
'title': 'Yra kaip yra',
|
||||
'upload_date': '20150107',
|
||||
'description': 'md5:d82a5e36b775b7048617f263a0e3475e',
|
||||
'age_limit': 7,
|
||||
'duration': 3019,
|
||||
'thumbnail': r're:^https?://.*\.jpg$'
|
||||
'title': "Put'ka: Trys Klausimai",
|
||||
'upload_date': '20161216',
|
||||
'description': 'Seniai matytas Put’ka užduoda tris klausimėlius. Pabandykime surasti atsakymus.',
|
||||
'age_limit': 18,
|
||||
'duration': 117,
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'timestamp': 1481904000,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True, # HLS download
|
||||
@ -30,20 +33,21 @@ class LnkGoIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://lnkgo.alfa.lt/visi-video/aktualai-pratesimas/ziurek-nerdas-taiso-kompiuteri-2',
|
||||
'info_dict': {
|
||||
'id': '47289',
|
||||
'id': '10467',
|
||||
'ext': 'mp4',
|
||||
'title': 'Nėrdas: Kompiuterio Valymas',
|
||||
'upload_date': '20150113',
|
||||
'description': 'md5:7352d113a242a808676ff17e69db6a69',
|
||||
'age_limit': 18,
|
||||
'duration': 346,
|
||||
'thumbnail': r're:^https?://.*\.jpg$'
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'timestamp': 1421164800,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True, # HLS download
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.lnkgo.lt/visi-video/aktualai-pratesimas/ziurek-putka-trys-klausimai',
|
||||
'url': 'https://lnk.lt/video/neigalieji-tv-bokste/37413',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_AGE_LIMITS = {
|
||||
@ -51,66 +55,34 @@ class LnkGoIE(InfoExtractor):
|
||||
'N-14': 14,
|
||||
'S': 18,
|
||||
}
|
||||
_M3U8_TEMPL = 'https://vod.lnk.lt/lnk_vod/lnk/lnk/%s:%s/playlist.m3u8%s'
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
display_id, video_id = re.match(self._VALID_URL, url).groups()
|
||||
|
||||
webpage = self._download_webpage(
|
||||
url, display_id, 'Downloading player webpage')
|
||||
|
||||
video_id = self._search_regex(
|
||||
r'data-ep="([^"]+)"', webpage, 'video ID')
|
||||
title = self._og_search_title(webpage)
|
||||
description = self._og_search_description(webpage)
|
||||
upload_date = unified_strdate(self._search_regex(
|
||||
r'class="[^"]*meta-item[^"]*air-time[^"]*">.*?<strong>([^<]+)</strong>', webpage, 'upload date', fatal=False))
|
||||
|
||||
thumbnail_w = int_or_none(
|
||||
self._og_search_property('image:width', webpage, 'thumbnail width', fatal=False))
|
||||
thumbnail_h = int_or_none(
|
||||
self._og_search_property('image:height', webpage, 'thumbnail height', fatal=False))
|
||||
thumbnail = {
|
||||
'url': self._og_search_thumbnail(webpage),
|
||||
}
|
||||
if thumbnail_w and thumbnail_h:
|
||||
thumbnail.update({
|
||||
'width': thumbnail_w,
|
||||
'height': thumbnail_h,
|
||||
})
|
||||
|
||||
config = self._parse_json(self._search_regex(
|
||||
r'episodePlayer\((\{.*?\}),\s*\{', webpage, 'sources'), video_id)
|
||||
|
||||
if config.get('pGeo'):
|
||||
self.report_warning(
|
||||
'This content might not be available in your country due to copyright reasons')
|
||||
|
||||
formats = [{
|
||||
'format_id': 'hls',
|
||||
'ext': 'mp4',
|
||||
'url': config['EpisodeVideoLink_HLS'],
|
||||
}]
|
||||
|
||||
m = re.search(r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<play_path>.+)$', config['EpisodeVideoLink'])
|
||||
if m:
|
||||
formats.append({
|
||||
'format_id': 'rtmp',
|
||||
'ext': 'flv',
|
||||
'url': m.group('url'),
|
||||
'play_path': m.group('play_path'),
|
||||
'page_url': url,
|
||||
})
|
||||
video_info = self._download_json(
|
||||
'https://lnk.lt/api/main/video-page/%s/%s/false' % (display_id, video_id or '0'),
|
||||
display_id)['videoConfig']['videoInfo']
|
||||
|
||||
video_id = compat_str(video_info['id'])
|
||||
title = video_info['title']
|
||||
prefix = 'smil' if video_info.get('isQualityChangeAvailable') else 'mp4'
|
||||
formats = self._extract_m3u8_formats(
|
||||
self._M3U8_TEMPL % (prefix, video_info['videoUrl'], video_info.get('secureTokenParams') or ''),
|
||||
video_id, 'mp4', 'm3u8_native')
|
||||
self._sort_formats(formats)
|
||||
|
||||
poster_image = video_info.get('posterImage')
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'display_id': display_id,
|
||||
'title': title,
|
||||
'formats': formats,
|
||||
'thumbnails': [thumbnail],
|
||||
'duration': int_or_none(config.get('VideoTime')),
|
||||
'description': description,
|
||||
'age_limit': self._AGE_LIMITS.get(config.get('PGRating'), 0),
|
||||
'upload_date': upload_date,
|
||||
'thumbnail': 'https://lnk.lt/all-images/' + poster_image if poster_image else None,
|
||||
'duration': int_or_none(video_info.get('duration')),
|
||||
'description': clean_html(video_info.get('htmlDescription')),
|
||||
'age_limit': self._AGE_LIMITS.get(video_info.get('pgRating'), 0),
|
||||
'timestamp': parse_iso8601(video_info.get('airDate')),
|
||||
'view_count': int_or_none(video_info.get('viewsCount')),
|
||||
}
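The new LnkGo code builds its HLS manifest URL from three pieces of the API response. A minimal sketch of that string assembly, assuming sample `video_info` dictionaries invented for illustration (the helper name `build_m3u8_url` is hypothetical; the template string is the one from the diff):

```python
M3U8_TEMPL = 'https://vod.lnk.lt/lnk_vod/lnk/lnk/%s:%s/playlist.m3u8%s'


def build_m3u8_url(video_info):
    # "smil" selects the multi-quality manifest, "mp4" the single rendition.
    prefix = 'smil' if video_info.get('isQualityChangeAvailable') else 'mp4'
    # A missing token falls back to an empty suffix rather than the text "None".
    token = video_info.get('secureTokenParams') or ''
    return M3U8_TEMPL % (prefix, video_info['videoUrl'], token)


# Invented sample data, not real API output.
print(build_m3u8_url({'videoUrl': 'shows/ep1.mp4'}))
print(build_m3u8_url({
    'videoUrl': 'shows/ep1.smil',
    'isQualityChangeAvailable': True,
    'secureTokenParams': '?token=abc',
}))
```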
|
||||
|
@ -1,42 +0,0 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import ExtractorError
|
||||
|
||||
|
||||
class MacGameStoreIE(InfoExtractor):
|
||||
IE_NAME = 'macgamestore'
|
||||
IE_DESC = 'MacGameStore trailers'
|
||||
_VALID_URL = r'https?://(?:www\.)?macgamestore\.com/mediaviewer\.php\?trailer=(?P<id>\d+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.macgamestore.com/mediaviewer.php?trailer=2450',
|
||||
'md5': '8649b8ea684b6666b4c5be736ecddc61',
|
||||
'info_dict': {
|
||||
'id': '2450',
|
||||
'ext': 'm4v',
|
||||
'title': 'Crow',
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(
|
||||
url, video_id, 'Downloading trailer page')
|
||||
|
||||
if '>Missing Media<' in webpage:
|
||||
raise ExtractorError(
|
||||
'Trailer %s does not exist' % video_id, expected=True)
|
||||
|
||||
video_title = self._html_search_regex(
|
||||
r'<title>MacGameStore: (.*?) Trailer</title>', webpage, 'title')
|
||||
|
||||
video_url = self._html_search_regex(
|
||||
r'(?s)<div\s+id="video-player".*?href="([^"]+)"\s*>',
|
||||
webpage, 'video URL')
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'url': video_url,
|
||||
'title': video_title
|
||||
}
|
@ -1,32 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class MakerTVIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:(?:www\.)?maker\.tv/(?:[^/]+/)*video|makerplayer\.com/embed/maker)/(?P<id>[a-zA-Z0-9]{12})'
|
||||
_TEST = {
|
||||
'url': 'http://www.maker.tv/video/Fh3QgymL9gsc',
|
||||
'md5': 'ca237a53a8eb20b6dc5bd60564d4ab3e',
|
||||
'info_dict': {
|
||||
'id': 'Fh3QgymL9gsc',
|
||||
'ext': 'mp4',
|
||||
'title': 'Maze Runner: The Scorch Trials Official Movie Review',
|
||||
'description': 'md5:11ff3362d7ef1d679fdb649f6413975a',
|
||||
'upload_date': '20150918',
|
||||
'timestamp': 1442549540,
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
jwplatform_id = self._search_regex(r'jw_?id="([^"]+)"', webpage, 'jwplatform id')
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'id': video_id,
|
||||
'url': 'jwplatform:%s' % jwplatform_id,
|
||||
'ie_key': 'JWPlatform',
|
||||
}
|
@ -10,18 +10,21 @@ from ..utils import int_or_none
|
||||
|
||||
|
||||
class MangomoloBaseIE(InfoExtractor):
|
||||
_BASE_REGEX = r'https?://(?:admin\.mangomolo\.com/analytics/index\.php/customers/embed/|player\.mangomolo\.com/v1/)'
|
||||
|
||||
def _get_real_id(self, page_id):
|
||||
return page_id
|
||||
|
||||
def _real_extract(self, url):
|
||||
page_id = self._get_real_id(self._match_id(url))
|
||||
webpage = self._download_webpage(url, page_id)
|
||||
webpage = self._download_webpage(
|
||||
'https://player.mangomolo.com/v1/%s?%s' % (self._TYPE, url.split('?')[1]), page_id)
|
||||
hidden_inputs = self._hidden_inputs(webpage)
|
||||
m3u8_entry_protocol = 'm3u8' if self._IS_LIVE else 'm3u8_native'
|
||||
|
||||
format_url = self._html_search_regex(
|
||||
[
|
||||
r'file\s*:\s*"(https?://[^"]+?/playlist\.m3u8)',
|
||||
r'(?:file|src)\s*:\s*"(https?://[^"]+?/playlist\.m3u8)',
|
||||
r'<a[^>]+href="(rtsp://[^"]+)"'
|
||||
], webpage, 'format url')
|
||||
formats = self._extract_wowza_formats(
|
||||
@ -39,14 +42,16 @@ class MangomoloBaseIE(InfoExtractor):
|
||||
|
||||
|
||||
class MangomoloVideoIE(MangomoloBaseIE):
|
||||
IE_NAME = 'mangomolo:video'
|
||||
_VALID_URL = r'https?://admin\.mangomolo\.com/analytics/index\.php/customers/embed/video\?.*?\bid=(?P<id>\d+)'
|
||||
_TYPE = 'video'
|
||||
IE_NAME = 'mangomolo:' + _TYPE
|
||||
_VALID_URL = MangomoloBaseIE._BASE_REGEX + r'video\?.*?\bid=(?P<id>\d+)'
|
||||
_IS_LIVE = False
|
||||
|
||||
|
||||
class MangomoloLiveIE(MangomoloBaseIE):
|
||||
IE_NAME = 'mangomolo:live'
|
||||
_VALID_URL = r'https?://admin\.mangomolo\.com/analytics/index\.php/customers/embed/index\?.*?\bchannelid=(?P<id>(?:[A-Za-z0-9+/=]|%2B|%2F|%3D)+)'
|
||||
_TYPE = 'live'
|
||||
IE_NAME = 'mangomolo:' + _TYPE
|
||||
_VALID_URL = MangomoloBaseIE._BASE_REGEX + r'(live|index)\?.*?\bchannelid=(?P<id>(?:[A-Za-z0-9+/=]|%2B|%2F|%3D)+)'
|
||||
_IS_LIVE = True
|
||||
|
||||
def _get_real_id(self, page_id):
|
||||
|
@ -27,7 +27,7 @@ class MediasetIE(ThePlatformBaseIE):
|
||||
(?:video|on-demand)/(?:[^/]+/)+[^/]+_|
|
||||
player/index\.html\?.*?\bprogramGuid=
|
||||
)
|
||||
)(?P<id>[0-9A-Z]{16})
|
||||
)(?P<id>[0-9A-Z]{16,})
|
||||
'''
|
||||
_TESTS = [{
|
||||
# full episode
|
||||
@ -62,7 +62,6 @@ class MediasetIE(ThePlatformBaseIE):
|
||||
'uploader': 'Canale 5',
|
||||
'uploader_id': 'C5',
|
||||
},
|
||||
'expected_warnings': ['HTTP Error 403: Forbidden'],
|
||||
}, {
|
||||
# clip
|
||||
'url': 'https://www.mediasetplay.mediaset.it/video/gogglebox/un-grande-classico-della-commedia-sexy_FAFU000000661680',
|
||||
@ -78,6 +77,18 @@ class MediasetIE(ThePlatformBaseIE):
|
||||
}, {
|
||||
'url': 'mediaset:FAFU000000665924',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.mediasetplay.mediaset.it/video/mediasethaacuoreilfuturo/palmieri-alicudi-lisola-dei-tre-bambini-felici--un-decreto-per-alicudi-e-tutte-le-microscuole_FD00000000102295',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.mediasetplay.mediaset.it/video/cherryseason/anticipazioni-degli-episodi-del-23-ottobre_F306837101005C02',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.mediasetplay.mediaset.it/video/tg5/ambiente-onda-umana-per-salvare-il-pianeta_F309453601079D01',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.mediasetplay.mediaset.it/video/grandefratellovip/benedetta-una-doccia-gelata_F309344401044C135',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@staticmethod
|
||||
@ -109,6 +120,11 @@ class MediasetIE(ThePlatformBaseIE):
|
||||
entries.append(embed_url)
|
||||
return entries
|
||||
|
||||
def _parse_smil_formats(self, smil, smil_url, video_id, namespace=None, f4m_params=None, transform_rtmp_url=None):
|
||||
for video in smil.findall(self._xpath_ns('.//video', namespace)):
|
||||
video.attrib['src'] = re.sub(r'(https?://vod05)t(-mediaset-it\.akamaized\.net/.+?\.mpd)\?.+', r'\1\2', video.attrib['src'])
|
||||
return super(MediasetIE, self)._parse_smil_formats(smil, smil_url, video_id, namespace, f4m_params, transform_rtmp_url)
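The `_parse_smil_formats` override rewrites DASH manifest URLs before delegating to the parent class. A minimal sketch of what the substitution does to a single `src` value, assuming an invented manifest URL and a hypothetical helper name (`strip_token_host`); the dot before `mpd` is escaped here:

```python
import re


def strip_token_host(src):
    # Drop the trailing "t" from the "vod05t" host label and discard the
    # query string, but only for .mpd manifests on this Akamai host.
    return re.sub(
        r'(https?://vod05)t(-mediaset-it\.akamaized\.net/.+?\.mpd)\?.+',
        r'\1\2', src)


# Invented URL, for illustration only.
print(strip_token_host(
    'https://vod05t-mediaset-it.akamaized.net/path/manifest.mpd?hdnea=abc'))
```

URLs that do not match the pattern, including manifests without a query string, are returned unchanged.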
|
||||
|
||||
def _real_extract(self, url):
|
||||
guid = self._match_id(url)
|
||||
tp_path = 'PR1GhC/media/guid/2702976343/' + guid
|
||||
@ -118,14 +134,15 @@ class MediasetIE(ThePlatformBaseIE):
|
||||
subtitles = {}
|
||||
first_e = None
|
||||
for asset_type in ('SD', 'HD'):
|
||||
for f in ('MPEG4', 'MPEG-DASH', 'M3U', 'ISM'):
|
||||
# TODO: fixup ISM+none manifest URLs
|
||||
for f in ('MPEG4', 'MPEG-DASH+none', 'M3U+none'):
|
||||
try:
|
||||
tp_formats, tp_subtitles = self._extract_theplatform_smil(
|
||||
update_url_query('http://link.theplatform.%s/s/%s' % (self._TP_TLD, tp_path), {
|
||||
'mbr': 'true',
|
||||
'formats': f,
|
||||
'assetTypes': asset_type,
|
||||
}), guid, 'Downloading %s %s SMIL data' % (f, asset_type))
|
||||
}), guid, 'Downloading %s %s SMIL data' % (f.split('+')[0], asset_type))
|
||||
except ExtractorError as e:
|
||||
if not first_e:
|
||||
first_e = e
|
||||
|
@ -1,70 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_duration,
|
||||
parse_filesize,
|
||||
sanitized_Request,
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class MinhatecaIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://minhateca\.com\.br/[^?#]+,(?P<id>[0-9]+)\.'
|
||||
_TEST = {
|
||||
'url': 'http://minhateca.com.br/pereba/misc/youtube-dl+test+video,125848331.mp4(video)',
|
||||
'info_dict': {
|
||||
'id': '125848331',
|
||||
'ext': 'mp4',
|
||||
'title': 'youtube-dl test video',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'filesize_approx': 1530000,
|
||||
'duration': 9,
|
||||
'view_count': int,
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
token = self._html_search_regex(
|
||||
r'<input name="__RequestVerificationToken".*?value="([^"]+)"',
|
||||
webpage, 'request token')
|
||||
token_data = [
|
||||
('fileId', video_id),
|
||||
('__RequestVerificationToken', token),
|
||||
]
|
||||
req = sanitized_Request(
|
||||
'http://minhateca.com.br/action/License/Download',
|
||||
data=urlencode_postdata(token_data))
|
||||
req.add_header('Content-Type', 'application/x-www-form-urlencoded')
|
||||
data = self._download_json(
|
||||
req, video_id, note='Downloading metadata')
|
||||
|
||||
video_url = data['redirectUrl']
|
||||
title_str = self._html_search_regex(
|
||||
r'<h1.*?>(.*?)</h1>', webpage, 'title')
|
||||
title, _, ext = title_str.rpartition('.')
|
||||
filesize_approx = parse_filesize(self._html_search_regex(
|
||||
r'<p class="fileSize">(.*?)</p>',
|
||||
webpage, 'file size approximation', fatal=False))
|
||||
duration = parse_duration(self._html_search_regex(
|
||||
r'(?s)<p class="fileLeng[ht][th]">.*?class="bold">(.*?)<',
|
||||
webpage, 'duration', fatal=False))
|
||||
view_count = int_or_none(self._html_search_regex(
|
||||
r'<p class="downloadsCounter">([0-9]+)</p>',
|
||||
webpage, 'view count', fatal=False))
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'url': video_url,
|
||||
'title': title,
|
||||
'ext': ext,
|
||||
'filesize_approx': filesize_approx,
|
||||
'duration': duration,
|
||||
'view_count': view_count,
|
||||
'thumbnail': self._og_search_thumbnail(webpage),
|
||||
}
|
@ -65,30 +65,6 @@ class TechTVMITIE(InfoExtractor):
|
||||
}
|
||||
|
||||
|
||||
class MITIE(TechTVMITIE):
|
||||
IE_NAME = 'video.mit.edu'
|
||||
_VALID_URL = r'https?://video\.mit\.edu/watch/(?P<title>[^/]+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://video.mit.edu/watch/the-government-is-profiling-you-13222/',
|
||||
'md5': '7db01d5ccc1895fc5010e9c9e13648da',
|
||||
'info_dict': {
|
||||
'id': '21783',
|
||||
'ext': 'mp4',
|
||||
'title': 'The Government is Profiling You',
|
||||
'description': 'md5:ad5795fe1e1623b73620dbfd47df9afd',
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
page_title = mobj.group('title')
|
||||
webpage = self._download_webpage(url, page_title)
|
||||
embed_url = self._search_regex(
|
||||
r'<iframe .*?src="(.+?)"', webpage, 'embed url')
|
||||
return self.url_result(embed_url)
|
||||
|
||||
|
||||
class OCWMITIE(InfoExtractor):
|
||||
IE_NAME = 'ocw.mit.edu'
|
||||
_VALID_URL = r'^https?://ocw\.mit\.edu/courses/(?P<topic>[a-z0-9\-]+)'
|
||||
|
@ -1,6 +1,5 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import functools
|
||||
import itertools
|
||||
import re
|
||||
|
||||
@ -11,28 +10,37 @@ from ..compat import (
|
||||
compat_ord,
|
||||
compat_str,
|
||||
compat_urllib_parse_unquote,
|
||||
compat_urlparse,
|
||||
compat_zip
|
||||
)
|
||||
from ..utils import (
|
||||
clean_html,
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
OnDemandPagedList,
|
||||
str_to_int,
|
||||
parse_iso8601,
|
||||
strip_or_none,
|
||||
try_get,
|
||||
urljoin,
|
||||
)
|
||||
|
||||
|
||||
class MixcloudIE(InfoExtractor):
|
||||
class MixcloudBaseIE(InfoExtractor):
|
||||
def _call_api(self, object_type, object_fields, display_id, username, slug=None):
|
||||
lookup_key = object_type + 'Lookup'
|
||||
return self._download_json(
|
||||
'https://www.mixcloud.com/graphql', display_id, query={
|
||||
'query': '''{
|
||||
%s(lookup: {username: "%s"%s}) {
|
||||
%s
|
||||
}
|
||||
}''' % (lookup_key, username, ', slug: "%s"' % slug if slug else '', object_fields)
|
||||
})['data'][lookup_key]
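`_call_api` assembles a GraphQL lookup query by string formatting and reads the result from the matching `...Lookup` key. A minimal sketch of only the query construction, with no network access; `build_lookup_query` is a hypothetical name and the whitespace inside the query is simplified:

```python
def build_lookup_query(object_type, object_fields, username, slug=None):
    # The response is expected under "<object_type>Lookup" in the data object.
    lookup_key = object_type + 'Lookup'
    # The slug argument is only sent when a slug is known.
    slug_part = ', slug: "%s"' % slug if slug else ''
    query = '{\n  %s(lookup: {username: "%s"%s}) {\n    %s\n  }\n}' % (
        lookup_key, username, slug_part, object_fields)
    return lookup_key, query


key, query = build_lookup_query('cloudcast', 'name', 'dholbach', 'cryptkeeper')
print(key)
print(query)
```

Note that the values are interpolated without escaping, so this sketch (like the formatting approach it mirrors) assumes user names and slugs that contain no double quotes.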
|
||||
|
||||
|
||||
class MixcloudIE(MixcloudBaseIE):
|
||||
_VALID_URL = r'https?://(?:(?:www|beta|m)\.)?mixcloud\.com/([^/]+)/(?!stream|uploads|favorites|listens|playlists)([^/]+)'
|
||||
IE_NAME = 'mixcloud'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.mixcloud.com/dholbach/cryptkeeper/',
|
||||
'info_dict': {
|
||||
'id': 'dholbach-cryptkeeper',
|
||||
'id': 'dholbach_cryptkeeper',
|
||||
'ext': 'm4a',
|
||||
'title': 'Cryptkeeper',
|
||||
'description': 'After quite a long silence from myself, finally another Drum\'n\'Bass mix with my favourite current dance floor bangers.',
|
||||
@ -40,11 +48,13 @@ class MixcloudIE(InfoExtractor):
|
||||
'uploader_id': 'dholbach',
|
||||
'thumbnail': r're:https?://.*\.jpg',
|
||||
'view_count': int,
|
||||
'timestamp': 1321359578,
|
||||
'upload_date': '20111115',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.mixcloud.com/gillespeterson/caribou-7-inch-vinyl-mix-chat/',
|
||||
'info_dict': {
|
||||
'id': 'gillespeterson-caribou-7-inch-vinyl-mix-chat',
|
||||
'id': 'gillespeterson_caribou-7-inch-vinyl-mix-chat',
|
||||
'ext': 'mp3',
|
||||
'title': 'Caribou 7 inch Vinyl Mix & Chat',
|
||||
'description': 'md5:2b8aec6adce69f9d41724647c65875e8',
|
||||
@ -52,11 +62,14 @@ class MixcloudIE(InfoExtractor):
|
||||
'uploader_id': 'gillespeterson',
|
||||
'thumbnail': 're:https?://.*',
|
||||
'view_count': int,
|
||||
'timestamp': 1422987057,
|
||||
'upload_date': '20150203',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://beta.mixcloud.com/RedLightRadio/nosedrip-15-red-light-radio-01-18-2016/',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_DECRYPTION_KEY = 'IFYOUWANTTHEARTISTSTOGETPAIDDONOTDOWNLOADFROMMIXCLOUD'
|
||||
|
||||
@staticmethod
|
||||
def _decrypt_xor_cipher(key, ciphertext):
|
||||
@ -66,176 +79,193 @@ class MixcloudIE(InfoExtractor):
|
||||
for ch, k in compat_zip(ciphertext, itertools.cycle(key))])
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
uploader = mobj.group(1)
|
||||
cloudcast_name = mobj.group(2)
|
||||
track_id = compat_urllib_parse_unquote('-'.join((uploader, cloudcast_name)))
|
||||
username, slug = re.match(self._VALID_URL, url).groups()
|
||||
username, slug = compat_urllib_parse_unquote(username), compat_urllib_parse_unquote(slug)
|
||||
track_id = '%s_%s' % (username, slug)
|
||||
|
||||
webpage = self._download_webpage(url, track_id)
|
||||
cloudcast = self._call_api('cloudcast', '''audioLength
|
||||
comments(first: 100) {
|
||||
edges {
|
||||
node {
|
||||
comment
|
||||
created
|
||||
user {
|
||||
displayName
|
||||
username
|
||||
}
|
||||
}
|
||||
}
|
||||
totalCount
|
||||
}
|
||||
description
|
||||
favorites {
|
||||
totalCount
|
||||
}
|
||||
featuringArtistList
|
||||
isExclusive
|
||||
name
|
||||
owner {
|
||||
displayName
|
||||
url
|
||||
username
|
||||
}
|
||||
picture(width: 1024, height: 1024) {
|
||||
url
|
||||
}
|
||||
plays
|
||||
publishDate
|
||||
reposts {
|
||||
totalCount
|
||||
}
|
||||
streamInfo {
|
||||
dashUrl
|
||||
hlsUrl
|
||||
url
|
||||
}
|
||||
tags {
|
||||
tag {
|
||||
name
|
||||
}
|
||||
}''', track_id, username, slug)
|
||||
|
||||
# Legacy path
|
||||
encrypted_play_info = self._search_regex(
|
||||
r'm-play-info="([^"]+)"', webpage, 'play info', default=None)
|
||||
title = cloudcast['name']
|
||||
|
||||
if encrypted_play_info is not None:
|
||||
# Decode
|
||||
encrypted_play_info = compat_b64decode(encrypted_play_info)
|
||||
else:
|
||||
# New path
|
||||
full_info_json = self._parse_json(self._html_search_regex(
|
||||
r'<script id="relay-data" type="text/x-mixcloud">([^<]+)</script>',
|
||||
webpage, 'play info'), 'play info')
|
||||
for item in full_info_json:
|
||||
item_data = try_get(
|
||||
item, lambda x: x['cloudcast']['data']['cloudcastLookup'],
|
||||
dict)
|
||||
if try_get(item_data, lambda x: x['streamInfo']['url']):
|
||||
info_json = item_data
|
||||
break
|
||||
else:
|
||||
raise ExtractorError('Failed to extract matching stream info')
|
||||
stream_info = cloudcast['streamInfo']
|
||||
formats = []
|
||||
|
||||
message = self._html_search_regex(
|
||||
r'(?s)<div[^>]+class="global-message cloudcast-disabled-notice-light"[^>]*>(.+?)<(?:a|/div)',
|
||||
webpage, 'error message', default=None)
|
||||
|
||||
js_url = self._search_regex(
|
||||
r'<script[^>]+\bsrc=["\"](https://(?:www\.)?mixcloud\.com/media/(?:js2/www_js_4|js/www)\.[^>]+\.js)',
|
||||
webpage, 'js url')
|
||||
js = self._download_webpage(js_url, track_id, 'Downloading JS')
|
||||
# Known plaintext attack
|
||||
if encrypted_play_info:
|
||||
kps = ['{"stream_url":']
|
||||
kpa_target = encrypted_play_info
|
||||
else:
|
||||
kps = ['https://', 'http://']
|
||||
kpa_target = compat_b64decode(info_json['streamInfo']['url'])
|
||||
for kp in kps:
|
||||
partial_key = self._decrypt_xor_cipher(kpa_target, kp)
|
||||
for quote in ["'", '"']:
|
||||
key = self._search_regex(
|
||||
r'{0}({1}[^{0}]*){0}'.format(quote, re.escape(partial_key)),
|
||||
js, 'encryption key', default=None)
|
||||
if key is not None:
|
||||
break
|
||||
else:
|
||||
for url_key in ('url', 'hlsUrl', 'dashUrl'):
|
||||
format_url = stream_info.get(url_key)
|
||||
if not format_url:
|
||||
continue
|
||||
break
|
||||
else:
|
||||
raise ExtractorError('Failed to extract encryption key')
|
||||
decrypted = self._decrypt_xor_cipher(
|
||||
self._DECRYPTION_KEY, compat_b64decode(format_url))
|
||||
if url_key == 'hlsUrl':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
decrypted, track_id, 'mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
elif url_key == 'dashUrl':
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
decrypted, track_id, mpd_id='dash', fatal=False))
|
||||
else:
|
||||
formats.append({
|
||||
'format_id': 'http',
|
||||
'url': decrypted,
|
||||
'downloader_options': {
|
||||
# Mixcloud starts throttling at >~5M
|
||||
'http_chunk_size': 5242880,
|
||||
},
|
||||
})
|
||||
|
||||
if encrypted_play_info is not None:
|
||||
play_info = self._parse_json(self._decrypt_xor_cipher(key, encrypted_play_info), 'play info')
|
||||
if message and 'stream_url' not in play_info:
|
||||
raise ExtractorError('%s said: %s' % (self.IE_NAME, message), expected=True)
|
||||
song_url = play_info['stream_url']
|
||||
formats = [{
|
||||
'format_id': 'normal',
|
||||
'url': song_url
|
||||
}]
|
||||
if not formats and cloudcast.get('isExclusive'):
|
||||
self.raise_login_required()
|
||||
|
||||
title = self._html_search_regex(r'm-title="([^"]+)"', webpage, 'title')
|
||||
thumbnail = self._proto_relative_url(self._html_search_regex(
|
||||
r'm-thumbnail-url="([^"]+)"', webpage, 'thumbnail', fatal=False))
|
||||
uploader = self._html_search_regex(
|
||||
r'm-owner-name="([^"]+)"', webpage, 'uploader', fatal=False)
|
||||
uploader_id = self._search_regex(
|
||||
r'\s+"profile": "([^"]+)",', webpage, 'uploader id', fatal=False)
|
||||
description = self._og_search_description(webpage)
|
||||
view_count = str_to_int(self._search_regex(
|
||||
[r'<meta itemprop="interactionCount" content="UserPlays:([0-9]+)"',
|
||||
r'/listeners/?">([0-9,.]+)</a>',
|
||||
r'(?:m|data)-tooltip=["\']([\d,.]+) plays'],
|
||||
webpage, 'play count', default=None))
|
||||
self._sort_formats(formats)
|
||||
|
||||
else:
|
||||
title = info_json['name']
|
||||
thumbnail = urljoin(
|
||||
'https://thumbnailer.mixcloud.com/unsafe/600x600/',
|
||||
try_get(info_json, lambda x: x['picture']['urlRoot'], compat_str))
|
||||
uploader = try_get(info_json, lambda x: x['owner']['displayName'])
|
||||
uploader_id = try_get(info_json, lambda x: x['owner']['username'])
|
||||
description = try_get(info_json, lambda x: x['description'])
|
||||
view_count = int_or_none(try_get(info_json, lambda x: x['plays']))
|
||||
comments = []
|
||||
for edge in (try_get(cloudcast, lambda x: x['comments']['edges']) or []):
|
||||
node = edge.get('node') or {}
|
||||
text = strip_or_none(node.get('comment'))
|
||||
if not text:
|
||||
continue
|
||||
user = node.get('user') or {}
|
||||
comments.append({
|
||||
'author': user.get('displayName'),
|
||||
'author_id': user.get('username'),
|
||||
'text': text,
|
||||
'timestamp': parse_iso8601(node.get('created')),
|
||||
})
|
||||
|
||||
stream_info = info_json['streamInfo']
|
||||
formats = []
|
||||
tags = []
|
||||
for t in cloudcast.get('tags') or []:
|
||||
tag = try_get(t, lambda x: x['tag']['name'], compat_str)
|
||||
if tag:
|
||||
tags.append(tag)
|
||||
|
||||
def decrypt_url(f_url):
|
||||
for k in (key, 'IFYOUWANTTHEARTISTSTOGETPAIDDONOTDOWNLOADFROMMIXCLOUD'):
|
||||
decrypted_url = self._decrypt_xor_cipher(k, f_url)
|
||||
if re.search(r'^https?://[0-9A-Za-z.]+/[0-9A-Za-z/.?=&_-]+$', decrypted_url):
|
||||
return decrypted_url
|
||||
get_count = lambda x: int_or_none(try_get(cloudcast, lambda y: y[x]['totalCount']))
|
||||
|
||||
for url_key in ('url', 'hlsUrl', 'dashUrl'):
|
||||
format_url = stream_info.get(url_key)
|
||||
if not format_url:
|
||||
continue
|
||||
decrypted = decrypt_url(compat_b64decode(format_url))
|
||||
if not decrypted:
|
||||
continue
|
||||
if url_key == 'hlsUrl':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
decrypted, track_id, 'mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
elif url_key == 'dashUrl':
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
decrypted, track_id, mpd_id='dash', fatal=False))
|
||||
else:
|
||||
formats.append({
|
||||
'format_id': 'http',
|
||||
'url': decrypted,
|
||||
'downloader_options': {
|
||||
# Mixcloud starts throttling at >~5M
|
||||
'http_chunk_size': 5242880,
|
||||
},
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
owner = cloudcast.get('owner') or {}
|
||||
|
||||
return {
|
||||
'id': track_id,
|
||||
'title': title,
|
||||
'formats': formats,
|
||||
'description': description,
|
||||
'thumbnail': thumbnail,
|
||||
'uploader': uploader,
|
||||
'uploader_id': uploader_id,
|
||||
'view_count': view_count,
|
||||
'description': cloudcast.get('description'),
|
||||
'thumbnail': try_get(cloudcast, lambda x: x['picture']['url'], compat_str),
|
||||
'uploader': owner.get('displayName'),
|
||||
'timestamp': parse_iso8601(cloudcast.get('publishDate')),
|
||||
'uploader_id': owner.get('username'),
|
||||
'uploader_url': owner.get('url'),
|
||||
'duration': int_or_none(cloudcast.get('audioLength')),
|
||||
'view_count': int_or_none(cloudcast.get('plays')),
|
||||
'like_count': get_count('favorites'),
|
||||
'repost_count': get_count('reposts'),
|
||||
'comment_count': get_count('comments'),
|
||||
'comments': comments,
|
||||
'tags': tags,
|
||||
'artist': ', '.join(cloudcast.get('featuringArtistList') or []) or None,
|
||||
}
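The stream URLs returned by the API are base64-encoded and XOR-ed with the fixed `_DECRYPTION_KEY`. A minimal Python 3 sketch of that decoding, assuming an invented sample URL; `xor_cipher` and `decrypt_stream_url` are hypothetical names, and the sample is encrypted locally only to demonstrate the round trip:

```python
import base64
import itertools

KEY = 'IFYOUWANTTHEARTISTSTOGETPAIDDONOTDOWNLOADFROMMIXCLOUD'


def xor_cipher(key, data):
    # XOR each byte with the key, repeating the key as needed.
    # Applying the same operation twice restores the input.
    return bytes(b ^ ord(k) for b, k in zip(data, itertools.cycle(key)))


def decrypt_stream_url(encoded):
    # Undo the base64 layer first, then the XOR layer.
    return xor_cipher(KEY, base64.b64decode(encoded)).decode('utf-8')


# Invented URL; encrypt it here so there is something to decrypt.
sample = 'https://example.com/audio.m4a'
encoded = base64.b64encode(xor_cipher(KEY, sample.encode('utf-8')))
print(decrypt_stream_url(encoded))
```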
|
||||
|
||||
|
||||
class MixcloudPlaylistBaseIE(InfoExtractor):
|
||||
_PAGE_SIZE = 24
|
||||
class MixcloudPlaylistBaseIE(MixcloudBaseIE):
|
||||
def _get_cloudcast(self, node):
|
||||
return node
|
||||
|
||||
def _find_urls_in_page(self, page):
|
||||
for url in re.findall(r'm-play-button m-url="(?P<url>[^"]+)"', page):
|
||||
yield self.url_result(
|
||||
compat_urlparse.urljoin('https://www.mixcloud.com', clean_html(url)),
|
||||
MixcloudIE.ie_key())
|
||||
+    def _get_playlist_title(self, title, slug):
+        return title

-    def _fetch_tracks_page(self, path, video_id, page_name, current_page, real_page_number=None):
-        real_page_number = real_page_number or current_page + 1
-        return self._download_webpage(
-            'https://www.mixcloud.com/%s/' % path, video_id,
-            note='Download %s (page %d)' % (page_name, current_page + 1),
-            errnote='Unable to download %s' % page_name,
-            query={'page': real_page_number, 'list': 'main', '_ajax': '1'},
-            headers={'X-Requested-With': 'XMLHttpRequest'})
+    def _real_extract(self, url):
+        username, slug = re.match(self._VALID_URL, url).groups()
+        username = compat_urllib_parse_unquote(username)
+        if not slug:
+            slug = 'uploads'
+        else:
+            slug = compat_urllib_parse_unquote(slug)
+        playlist_id = '%s_%s' % (username, slug)

-    def _tracks_page_func(self, page, video_id, page_name, current_page):
-        resp = self._fetch_tracks_page(page, video_id, page_name, current_page)
+        is_playlist_type = self._ROOT_TYPE == 'playlist'
+        playlist_type = 'items' if is_playlist_type else slug
+        list_filter = ''

-        for item in self._find_urls_in_page(resp):
-            yield item
+        has_next_page = True
+        entries = []
+        while has_next_page:
+            playlist = self._call_api(
+                self._ROOT_TYPE, '''%s
+    %s
+    %s(first: 100%s) {
+      edges {
+        node {
+          %s
+        }
+      }
+      pageInfo {
+        endCursor
+        hasNextPage
+      }
+    }''' % (self._TITLE_KEY, self._DESCRIPTION_KEY, playlist_type, list_filter, self._NODE_TEMPLATE),
+                playlist_id, username, slug if is_playlist_type else None)

-    def _get_user_description(self, page_content):
-        return self._html_search_regex(
-            r'<div[^>]+class="profile-bio"[^>]*>(.+?)</div>',
-            page_content, 'user description', fatal=False)
+            items = playlist.get(playlist_type) or {}
+            for edge in items.get('edges', []):
+                cloudcast = self._get_cloudcast(edge.get('node') or {})
+                cloudcast_url = cloudcast.get('url')
+                if not cloudcast_url:
+                    continue
+                entries.append(self.url_result(
+                    cloudcast_url, MixcloudIE.ie_key(), cloudcast.get('slug')))
+
+            page_info = items['pageInfo']
+            has_next_page = page_info['hasNextPage']
+            list_filter = ', after: "%s"' % page_info['endCursor']
+
+        return self.playlist_result(
+            entries, playlist_id,
+            self._get_playlist_title(playlist[self._TITLE_KEY], slug),
+            playlist.get(self._DESCRIPTION_KEY))


 class MixcloudUserIE(MixcloudPlaylistBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?mixcloud\.com/(?P<user>[^/]+)/(?P<type>uploads|favorites|listens)?/?$'
+    _VALID_URL = r'https?://(?:www\.)?mixcloud\.com/(?P<id>[^/]+)/(?P<type>uploads|favorites|listens|stream)?/?$'
     IE_NAME = 'mixcloud:user'

     _TESTS = [{
@@ -243,68 +273,58 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
         'info_dict': {
             'id': 'dholbach_uploads',
             'title': 'Daniel Holbach (uploads)',
-            'description': 'md5:def36060ac8747b3aabca54924897e47',
+            'description': 'md5:b60d776f0bab534c5dabe0a34e47a789',
         },
-        'playlist_mincount': 11,
+        'playlist_mincount': 36,
     }, {
         'url': 'http://www.mixcloud.com/dholbach/uploads/',
         'info_dict': {
             'id': 'dholbach_uploads',
             'title': 'Daniel Holbach (uploads)',
-            'description': 'md5:def36060ac8747b3aabca54924897e47',
+            'description': 'md5:b60d776f0bab534c5dabe0a34e47a789',
         },
-        'playlist_mincount': 11,
+        'playlist_mincount': 36,
     }, {
         'url': 'http://www.mixcloud.com/dholbach/favorites/',
         'info_dict': {
             'id': 'dholbach_favorites',
             'title': 'Daniel Holbach (favorites)',
-            'description': 'md5:def36060ac8747b3aabca54924897e47',
+            'description': 'md5:b60d776f0bab534c5dabe0a34e47a789',
         },
-        'params': {
-            'playlist_items': '1-100',
-        },
-        'playlist_mincount': 100,
+        # 'params': {
+        # 'playlist_items': '1-100',
+        # },
+        'playlist_mincount': 396,
     }, {
         'url': 'http://www.mixcloud.com/dholbach/listens/',
         'info_dict': {
             'id': 'dholbach_listens',
             'title': 'Daniel Holbach (listens)',
-            'description': 'md5:def36060ac8747b3aabca54924897e47',
+            'description': 'md5:b60d776f0bab534c5dabe0a34e47a789',
         },
-        'params': {
-            'playlist_items': '1-100',
+        # 'params': {
+        # 'playlist_items': '1-100',
+        # },
+        'playlist_mincount': 1623,
+        'skip': 'Large list',
+    }, {
+        'url': 'https://www.mixcloud.com/FirstEar/stream/',
+        'info_dict': {
+            'id': 'FirstEar_stream',
+            'title': 'First Ear (stream)',
+            'description': 'Curators of good music\r\n\r\nfirstearmusic.com',
         },
-        'playlist_mincount': 100,
+        'playlist_mincount': 271,
     }]

-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        user_id = mobj.group('user')
-        list_type = mobj.group('type')
+    _TITLE_KEY = 'displayName'
+    _DESCRIPTION_KEY = 'biog'
+    _ROOT_TYPE = 'user'
+    _NODE_TEMPLATE = '''slug
+          url'''

-        # if only a profile URL was supplied, default to download all uploads
-        if list_type is None:
-            list_type = 'uploads'
-
-        video_id = '%s_%s' % (user_id, list_type)
-
-        profile = self._download_webpage(
-            'https://www.mixcloud.com/%s/' % user_id, video_id,
-            note='Downloading user profile',
-            errnote='Unable to download user profile')
-
-        username = self._og_search_title(profile)
-        description = self._get_user_description(profile)
-
-        entries = OnDemandPagedList(
-            functools.partial(
-                self._tracks_page_func,
-                '%s/%s' % (user_id, list_type), video_id, 'list of %s' % list_type),
-            self._PAGE_SIZE)
-
-        return self.playlist_result(
-            entries, video_id, '%s (%s)' % (username, list_type), description)
+    def _get_playlist_title(self, title, slug):
+        return '%s (%s)' % (title, slug)


 class MixcloudPlaylistIE(MixcloudPlaylistBaseIE):
@@ -312,87 +332,20 @@ class MixcloudPlaylistIE(MixcloudPlaylistBaseIE):
     IE_NAME = 'mixcloud:playlist'

     _TESTS = [{
-        'url': 'https://www.mixcloud.com/RedBullThre3style/playlists/tokyo-finalists-2015/',
-        'info_dict': {
-            'id': 'RedBullThre3style_tokyo-finalists-2015',
-            'title': 'National Champions 2015',
-            'description': 'md5:6ff5fb01ac76a31abc9b3939c16243a3',
-        },
-        'playlist_mincount': 16,
-    }, {
         'url': 'https://www.mixcloud.com/maxvibes/playlists/jazzcat-on-ness-radio/',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        user_id = mobj.group('user')
-        playlist_id = mobj.group('playlist')
-        video_id = '%s_%s' % (user_id, playlist_id)
-
-        webpage = self._download_webpage(
-            url, user_id,
-            note='Downloading playlist page',
-            errnote='Unable to download playlist page')
-
-        title = self._html_search_regex(
-            r'<a[^>]+class="parent active"[^>]*><b>\d+</b><span[^>]*>([^<]+)',
-            webpage, 'playlist title',
-            default=None) or self._og_search_title(webpage, fatal=False)
-        description = self._get_user_description(webpage)
-
-        entries = OnDemandPagedList(
-            functools.partial(
-                self._tracks_page_func,
-                '%s/playlists/%s' % (user_id, playlist_id), video_id, 'tracklist'),
-            self._PAGE_SIZE)
-
-        return self.playlist_result(entries, video_id, title, description)
-
-
-class MixcloudStreamIE(MixcloudPlaylistBaseIE):
-    _VALID_URL = r'https?://(?:www\.)?mixcloud\.com/(?P<id>[^/]+)/stream/?$'
-    IE_NAME = 'mixcloud:stream'
-
-    _TEST = {
-        'url': 'https://www.mixcloud.com/FirstEar/stream/',
         'info_dict': {
-            'id': 'FirstEar',
-            'title': 'First Ear',
-            'description': 'Curators of good music\nfirstearmusic.com',
+            'id': 'maxvibes_jazzcat-on-ness-radio',
+            'title': 'Ness Radio sessions',
         },
-        'playlist_mincount': 192,
-    }
+        'playlist_mincount': 59,
+    }]
+    _TITLE_KEY = 'name'
+    _DESCRIPTION_KEY = 'description'
+    _ROOT_TYPE = 'playlist'
+    _NODE_TEMPLATE = '''cloudcast {
+            slug
+            url
+          }'''

-    def _real_extract(self, url):
-        user_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, user_id)
-
-        entries = []
-        prev_page_url = None
-
-        def _handle_page(page):
-            entries.extend(self._find_urls_in_page(page))
-            return self._search_regex(
-                r'm-next-page-url="([^"]+)"', page,
-                'next page URL', default=None)
-
-        next_page_url = _handle_page(webpage)
-
-        for idx in itertools.count(0):
-            if not next_page_url or prev_page_url == next_page_url:
-                break
-
-            prev_page_url = next_page_url
-            current_page = int(self._search_regex(
-                r'\?page=(\d+)', next_page_url, 'next page number'))
-
-            next_page_url = _handle_page(self._fetch_tracks_page(
-                '%s/stream' % user_id, user_id, 'stream', idx,
-                real_page_number=current_page))
-
-        username = self._og_search_title(webpage)
-        description = self._get_user_description(webpage)
-
-        return self.playlist_result(entries, user_id, username, description)
+    def _get_cloudcast(self, node):
+        return node.get('cloudcast') or {}
@@ -41,6 +41,14 @@ class MSNIE(InfoExtractor):
     }, {
         'url': 'http://www.msn.com/en-ae/entertainment/bollywood/watch-how-salman-khan-reacted-when-asked-if-he-would-apologize-for-his-‘raped-woman’-comment/vi-AAhvzW6',
         'only_matching': True,
+    }, {
+        # Vidible(AOL) Embed
+        'url': 'https://www.msn.com/en-us/video/animals/yellowstone-park-staffers-catch-deer-engaged-in-behavior-they-cant-explain/vi-AAGfdg1',
+        'only_matching': True,
+    }, {
+        # Dailymotion Embed
+        'url': 'https://www.msn.com/es-ve/entretenimiento/watch/winston-salem-paire-refait-des-siennes-en-perdant-sa-raquette-au-service/vp-AAG704L',
+        'only_matching': True,
     }]

     def _real_extract(self, url):
@@ -61,6 +69,18 @@ class MSNIE(InfoExtractor):
                 webpage, 'error', group='error'))
             raise ExtractorError('%s said: %s' % (self.IE_NAME, error), expected=True)

+        player_name = video.get('playerName')
+        if player_name:
+            provider_id = video.get('providerId')
+            if provider_id:
+                if player_name == 'AOL':
+                    return self.url_result(
+                        'aol-video:' + provider_id, 'Aol', provider_id)
+                elif player_name == 'Dailymotion':
+                    return self.url_result(
+                        'https://www.dailymotion.com/video/' + provider_id,
+                        'Dailymotion', provider_id)
+
         title = video['title']

         formats = []
@@ -1,3 +1,4 @@
+# coding: utf-8
 from __future__ import unicode_literals

 import re
@@ -349,33 +350,29 @@ class MTVIE(MTVServicesInfoExtractor):
     }]


-class MTV81IE(InfoExtractor):
-    IE_NAME = 'mtv81'
-    _VALID_URL = r'https?://(?:www\.)?mtv81\.com/videos/(?P<id>[^/?#.]+)'
+class MTVJapanIE(MTVServicesInfoExtractor):
+    IE_NAME = 'mtvjapan'
+    _VALID_URL = r'https?://(?:www\.)?mtvjapan\.com/videos/(?P<id>[0-9a-z]+)'

     _TEST = {
-        'url': 'http://www.mtv81.com/videos/artist-to-watch/the-godfather-of-japanese-hip-hop-segment-1/',
-        'md5': '1edbcdf1e7628e414a8c5dcebca3d32b',
+        'url': 'http://www.mtvjapan.com/videos/prayht/fresh-info-cadillac-escalade',
         'info_dict': {
-            'id': '5e14040d-18a4-47c4-a582-43ff602de88e',
+            'id': 'bc01da03-6fe5-4284-8880-f291f4e368f5',
             'ext': 'mp4',
-            'title': 'Unlocking The Truth|July 18, 2016|1|101|Trailer',
-            'description': '"Unlocking the Truth" premieres August 17th at 11/10c.',
-            'timestamp': 1468846800,
-            'upload_date': '20160718',
+            'title': '【Fresh Info】Cadillac ESCALADE Sport Edition',
         },
+        'params': {
+            'skip_download': True,
+        },
     }
+    _GEO_COUNTRIES = ['JP']
+    _FEED_URL = 'http://feeds.mtvnservices.com/od/feed/intl-mrss-player-feed'

-    def _extract_mgid(self, webpage):
-        return self._search_regex(
-            r'getTheVideo\((["\'])(?P<id>mgid:.+?)\1', webpage,
-            'mgid', group='id')
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
-        mgid = self._extract_mgid(webpage)
-        return self.url_result('http://media.mtvnservices.com/embed/%s' % mgid)
+    def _get_feed_query(self, uri):
+        return {
+            'arcEp': 'mtvjapan.com',
+            'mgid': uri,
+        }


 class MTVVideoIE(MTVServicesInfoExtractor):
@@ -425,14 +422,14 @@ class MTVVideoIE(MTVServicesInfoExtractor):

 class MTVDEIE(MTVServicesInfoExtractor):
     IE_NAME = 'mtv.de'
-    _VALID_URL = r'https?://(?:www\.)?mtv\.de/(?:artists|shows|news)/(?:[^/]+/)*(?P<id>\d+)-[^/#?]+/*(?:[#?].*)?$'
+    _VALID_URL = r'https?://(?:www\.)?mtv\.de/(?:musik/videoclips|folgen|news)/(?P<id>[0-9a-z]+)'
     _TESTS = [{
-        'url': 'http://www.mtv.de/artists/10571-cro/videos/61131-traum',
+        'url': 'http://www.mtv.de/musik/videoclips/2gpnv7/Traum',
         'info_dict': {
-            'id': 'music_video-a50bc5f0b3aa4b3190aa',
-            'ext': 'flv',
-            'title': 'MusicVideo_cro-traum',
-            'description': 'Cro - Traum',
+            'id': 'd5d472bc-f5b7-11e5-bffd-a4badb20dab5',
+            'ext': 'mp4',
+            'title': 'Traum',
+            'description': 'Traum',
         },
         'params': {
             # rtmp download
@@ -441,11 +438,12 @@ class MTVDEIE(MTVServicesInfoExtractor):
         'skip': 'Blocked at Travis CI',
     }, {
         # mediagen URL without query (e.g. http://videos.mtvnn.com/mediagen/e865da714c166d18d6f80893195fcb97)
-        'url': 'http://www.mtv.de/shows/933-teen-mom-2/staffeln/5353/folgen/63565-enthullungen',
+        'url': 'http://www.mtv.de/folgen/6b1ylu/teen-mom-2-enthuellungen-S5-F1',
         'info_dict': {
-            'id': 'local_playlist-f5ae778b9832cc837189',
-            'ext': 'flv',
-            'title': 'Episode_teen-mom-2_shows_season-5_episode-1_full-episode_part1',
+            'id': '1e5a878b-31c5-11e7-a442-0e40cf2fc285',
+            'ext': 'mp4',
+            'title': 'Teen Mom 2',
+            'description': 'md5:dc65e357ef7e1085ed53e9e9d83146a7',
         },
         'params': {
             # rtmp download
@@ -453,7 +451,7 @@ class MTVDEIE(MTVServicesInfoExtractor):
         },
         'skip': 'Blocked at Travis CI',
     }, {
-        'url': 'http://www.mtv.de/news/77491-mtv-movies-spotlight-pixels-teil-3',
+        'url': 'http://www.mtv.de/news/glolix/77491-mtv-movies-spotlight--pixels--teil-3',
         'info_dict': {
             'id': 'local_playlist-4e760566473c4c8c5344',
             'ext': 'mp4',
@@ -466,25 +464,11 @@ class MTVDEIE(MTVServicesInfoExtractor):
         },
         'skip': 'Das Video kann zur Zeit nicht abgespielt werden.',
     }]
+    _GEO_COUNTRIES = ['DE']
+    _FEED_URL = 'http://feeds.mtvnservices.com/od/feed/intl-mrss-player-feed'

-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, video_id)
-
-        playlist = self._parse_json(
-            self._search_regex(
-                r'window\.pagePlaylist\s*=\s*(\[.+?\]);\n', webpage, 'page playlist'),
-            video_id)
-
-        def _mrss_url(item):
-            return item['mrss'] + item.get('mrssvars', '')
-
-        # news pages contain single video in playlist with different id
-        if len(playlist) == 1:
-            return self._get_videos_info_from_url(_mrss_url(playlist[0]), video_id)
-
-        for item in playlist:
-            item_id = item.get('id')
-            if item_id and compat_str(item_id) == video_id:
-                return self._get_videos_info_from_url(_mrss_url(item), video_id)
+    def _get_feed_query(self, uri):
+        return {
+            'arcEp': 'mtv.de',
+            'mgid': uri,
+        }
@@ -1,73 +1,56 @@
+# coding: utf-8
 from __future__ import unicode_literals
-import os.path

+import re
+
 from .common import InfoExtractor
-from ..compat import (
-    compat_urllib_parse_urlparse,
-)
+from ..compat import compat_str
 from ..utils import (
-    ExtractorError,
+    int_or_none,
+    parse_duration,
+    xpath_text,
 )


 class MySpassIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?myspass\.de/.*'
+    _VALID_URL = r'https?://(?:www\.)?myspass\.de/([^/]+/)*(?P<id>\d+)'
     _TEST = {
         'url': 'http://www.myspass.de/myspass/shows/tvshows/absolute-mehrheit/Absolute-Mehrheit-vom-17022013-Die-Highlights-Teil-2--/11741/',
         'md5': '0b49f4844a068f8b33f4b7c88405862b',
         'info_dict': {
             'id': '11741',
             'ext': 'mp4',
-            'description': 'Wer kann in die Fu\u00dfstapfen von Wolfgang Kubicki treten und die Mehrheit der Zuschauer hinter sich versammeln? Wird vielleicht sogar die Absolute Mehrheit geknackt und der Jackpot von 200.000 Euro mit nach Hause genommen?',
-            'title': 'Absolute Mehrheit vom 17.02.2013 - Die Highlights, Teil 2',
+            'description': 'Wer kann in die Fußstapfen von Wolfgang Kubicki treten und die Mehrheit der Zuschauer hinter sich versammeln? Wird vielleicht sogar die Absolute Mehrheit geknackt und der Jackpot von 200.000 Euro mit nach Hause genommen?',
+            'title': '17.02.2013 - Die Highlights, Teil 2',
         },
     }

     def _real_extract(self, url):
-        META_DATA_URL_TEMPLATE = 'http://www.myspass.de/myspass/includes/apps/video/getvideometadataxml.php?id=%s'
+        video_id = self._match_id(url)

-        # video id is the last path element of the URL
-        # usually there is a trailing slash, so also try the second but last
-        url_path = compat_urllib_parse_urlparse(url).path
-        url_parent_path, video_id = os.path.split(url_path)
-        if not video_id:
-            _, video_id = os.path.split(url_parent_path)
-
-        # get metadata
-        metadata_url = META_DATA_URL_TEMPLATE % video_id
         metadata = self._download_xml(
-            metadata_url, video_id, transform_source=lambda s: s.strip())
+            'http://www.myspass.de/myspass/includes/apps/video/getvideometadataxml.php?id=' + video_id,
+            video_id)

-        # extract values from metadata
-        url_flv_el = metadata.find('url_flv')
-        if url_flv_el is None:
-            raise ExtractorError('Unable to extract download url')
-        video_url = url_flv_el.text
-        title_el = metadata.find('title')
-        if title_el is None:
-            raise ExtractorError('Unable to extract title')
-        title = title_el.text
-        format_id_el = metadata.find('format_id')
-        if format_id_el is None:
-            format = 'mp4'
-        else:
-            format = format_id_el.text
-        description_el = metadata.find('description')
-        if description_el is not None:
-            description = description_el.text
-        else:
-            description = None
-        imagePreview_el = metadata.find('imagePreview')
-        if imagePreview_el is not None:
-            thumbnail = imagePreview_el.text
-        else:
-            thumbnail = None
+        title = xpath_text(metadata, 'title', fatal=True)
+        video_url = xpath_text(metadata, 'url_flv', 'download url', True)
+        video_id_int = int(video_id)
+        for group in re.search(r'/myspass2009/\d+/(\d+)/(\d+)/(\d+)/', video_url).groups():
+            group_int = int(group)
+            if group_int > video_id_int:
+                video_url = video_url.replace(
+                    group, compat_str(group_int // video_id_int))

         return {
             'id': video_id,
             'url': video_url,
             'title': title,
-            'format': format,
-            'thumbnail': thumbnail,
-            'description': description,
+            'thumbnail': xpath_text(metadata, 'imagePreview'),
+            'description': xpath_text(metadata, 'description'),
+            'duration': parse_duration(xpath_text(metadata, 'duration')),
+            'series': xpath_text(metadata, 'format'),
+            'season_number': int_or_none(xpath_text(metadata, 'season')),
+            'season_id': xpath_text(metadata, 'season_id'),
+            'episode': title,
+            'episode_number': int_or_none(xpath_text(metadata, 'episode')),
         }
@@ -9,10 +9,13 @@ from .theplatform import ThePlatformIE
 from .adobepass import AdobePassIE
 from ..compat import compat_urllib_parse_unquote
 from ..utils import (
+    int_or_none,
+    js_to_json,
+    parse_duration,
     smuggle_url,
     try_get,
+    unified_timestamp,
     update_url_query,
-    int_or_none,
 )


@@ -85,27 +88,41 @@ class NBCIE(AdobePassIE):
         permalink, video_id = re.match(self._VALID_URL, url).groups()
         permalink = 'http' + compat_urllib_parse_unquote(permalink)
         response = self._download_json(
-            'https://api.nbc.com/v3/videos', video_id, query={
-                'filter[permalink]': permalink,
-                'fields[videos]': 'description,entitlement,episodeNumber,guid,keywords,seasonNumber,title,vChipRating',
-                'fields[shows]': 'shortTitle',
-                'include': 'show.shortTitle',
+            'https://friendship.nbc.co/v2/graphql', video_id, query={
+                'query': '''{
+  page(name: "%s", platform: web, type: VIDEO, userId: "0") {
+    data {
+      ... on VideoPageData {
+        description
+        episodeNumber
+        keywords
+        locked
+        mpxAccountId
+        mpxGuid
+        rating
+        seasonNumber
+        secondaryTitle
+        seriesShortTitle
+      }
+    }
+  }
+}''' % permalink,
             })
-        video_data = response['data'][0]['attributes']
+        video_data = response['data']['page']['data']
         query = {
             'mbr': 'true',
             'manifest': 'm3u',
         }
-        video_id = video_data['guid']
-        title = video_data['title']
-        if video_data.get('entitlement') == 'auth':
+        video_id = video_data['mpxGuid']
+        title = video_data['secondaryTitle']
+        if video_data.get('locked'):
             resource = self._get_mvpd_resource(
                 'nbcentertainment', title, video_id,
-                video_data.get('vChipRating'))
+                video_data.get('rating'))
             query['auth'] = self._extract_mvpd_auth(
                 url, video_id, 'nbcentertainment', resource)
         theplatform_url = smuggle_url(update_url_query(
-            'http://link.theplatform.com/s/NnzsPC/media/guid/2410887629/' + video_id,
+            'http://link.theplatform.com/s/NnzsPC/media/guid/%s/%s' % (video_data.get('mpxAccountId') or '2410887629', video_id),
             query), {'force_smil_url': True})
         return {
             '_type': 'url_transparent',
@@ -117,7 +134,7 @@ class NBCIE(AdobePassIE):
             'season_number': int_or_none(video_data.get('seasonNumber')),
             'episode_number': int_or_none(video_data.get('episodeNumber')),
             'episode': title,
-            'series': try_get(response, lambda x: x['included'][0]['attributes']['shortTitle']),
+            'series': video_data.get('seriesShortTitle'),
             'ie_key': 'ThePlatform',
         }

@@ -272,13 +289,12 @@ class NBCNewsIE(ThePlatformIE):
     _TESTS = [
         {
             'url': 'http://www.nbcnews.com/watch/nbcnews-com/how-twitter-reacted-to-the-snowden-interview-269389891880',
-            'md5': 'af1adfa51312291a017720403826bb64',
+            'md5': 'cf4bc9e6ce0130f00f545d80ecedd4bf',
             'info_dict': {
                 'id': '269389891880',
                 'ext': 'mp4',
                 'title': 'How Twitter Reacted To The Snowden Interview',
                 'description': 'md5:65a0bd5d76fe114f3c2727aa3a81fe64',
-                'uploader': 'NBCU-NEWS',
                 'timestamp': 1401363060,
                 'upload_date': '20140529',
             },
@@ -296,28 +312,26 @@ class NBCNewsIE(ThePlatformIE):
         },
         {
             'url': 'http://www.nbcnews.com/nightly-news/video/nightly-news-with-brian-williams-full-broadcast-february-4-394064451844',
-            'md5': '73135a2e0ef819107bbb55a5a9b2a802',
+            'md5': '8eb831eca25bfa7d25ddd83e85946548',
             'info_dict': {
                 'id': '394064451844',
                 'ext': 'mp4',
                 'title': 'Nightly News with Brian Williams Full Broadcast (February 4)',
                 'description': 'md5:1c10c1eccbe84a26e5debb4381e2d3c5',
                 'timestamp': 1423104900,
-                'uploader': 'NBCU-NEWS',
                 'upload_date': '20150205',
             },
         },
         {
             'url': 'http://www.nbcnews.com/business/autos/volkswagen-11-million-vehicles-could-have-suspect-software-emissions-scandal-n431456',
-            'md5': 'a49e173825e5fcd15c13fc297fced39d',
+            'md5': '4a8c4cec9e1ded51060bdda36ff0a5c0',
             'info_dict': {
-                'id': '529953347624',
+                'id': 'n431456',
                 'ext': 'mp4',
-                'title': 'Volkswagen U.S. Chief:\xa0 We Have Totally Screwed Up',
-                'description': 'md5:c8be487b2d80ff0594c005add88d8351',
+                'title': "Volkswagen U.S. Chief: We 'Totally Screwed Up'",
+                'description': 'md5:d22d1281a24f22ea0880741bb4dd6301',
                 'upload_date': '20150922',
                 'timestamp': 1442917800,
-                'uploader': 'NBCU-NEWS',
             },
         },
         {
@@ -330,7 +344,6 @@ class NBCNewsIE(ThePlatformIE):
                 'description': 'md5:74752b7358afb99939c5f8bb2d1d04b1',
                 'upload_date': '20160420',
                 'timestamp': 1461152093,
-                'uploader': 'NBCU-NEWS',
             },
         },
         {
@@ -344,7 +357,6 @@ class NBCNewsIE(ThePlatformIE):
                 'thumbnail': r're:^https?://.*\.jpg$',
                 'timestamp': 1406937606,
                 'upload_date': '20140802',
-                'uploader': 'NBCU-NEWS',
             },
         },
         {
@@ -360,20 +372,61 @@ class NBCNewsIE(ThePlatformIE):

     def _real_extract(self, url):
         video_id = self._match_id(url)
-        if not video_id.isdigit():
-            webpage = self._download_webpage(url, video_id)
+        webpage = self._download_webpage(url, video_id)

-            data = self._parse_json(self._search_regex(
-                r'window\.__data\s*=\s*({.+});', webpage,
-                'bootstrap json'), video_id)
-            video_id = data['article']['content'][0]['primaryMedia']['video']['mpxMetadata']['id']
+        data = self._parse_json(self._search_regex(
+            r'window\.__data\s*=\s*({.+});', webpage,
+            'bootstrap json'), video_id, js_to_json)
+        video_data = try_get(data, lambda x: x['video']['current'], dict)
+        if not video_data:
+            video_data = data['article']['content'][0]['primaryMedia']['video']
+        title = video_data['headline']['primary']
+
+        formats = []
+        for va in video_data.get('videoAssets', []):
+            public_url = va.get('publicUrl')
+            if not public_url:
+                continue
+            if '://link.theplatform.com/' in public_url:
+                public_url = update_url_query(public_url, {'format': 'redirect'})
+            format_id = va.get('format')
+            if format_id == 'M3U':
+                formats.extend(self._extract_m3u8_formats(
+                    public_url, video_id, 'mp4', 'm3u8_native',
+                    m3u8_id=format_id, fatal=False))
+                continue
+            tbr = int_or_none(va.get('bitrate'), 1000)
+            if tbr:
+                format_id += '-%d' % tbr
+            formats.append({
+                'format_id': format_id,
+                'url': public_url,
+                'width': int_or_none(va.get('width')),
+                'height': int_or_none(va.get('height')),
+                'tbr': tbr,
+                'ext': 'mp4',
+            })
+        self._sort_formats(formats)
+
+        subtitles = {}
+        closed_captioning = video_data.get('closedCaptioning')
+        if closed_captioning:
+            for cc_url in closed_captioning.values():
+                if not cc_url:
+                    continue
+                subtitles.setdefault('en', []).append({
+                    'url': cc_url,
+                })

         return {
-            '_type': 'url_transparent',
             'id': video_id,
-            # http://feed.theplatform.com/f/2E2eJC/nbcnews also works
-            'url': update_url_query('http://feed.theplatform.com/f/2E2eJC/nnd_NBCNews', {'byId': video_id}),
-            'ie_key': 'ThePlatformFeed',
+            'title': title,
+            'description': try_get(video_data, lambda x: x['description']['primary']),
+            'thumbnail': try_get(video_data, lambda x: x['primaryImage']['url']['primary']),
+            'duration': parse_duration(video_data.get('duration')),
+            'timestamp': unified_timestamp(video_data.get('datePublished')),
+            'formats': formats,
+            'subtitles': subtitles,
         }


@@ -108,7 +108,7 @@ class NexxIE(InfoExtractor):
     @staticmethod
     def _extract_domain_id(webpage):
         mobj = re.search(
-            r'<script\b[^>]+\bsrc=["\'](?:https?:)?//require\.nexx(?:\.cloud|cdn\.com)/(?P<id>\d+)',
+            r'<script\b[^>]+\bsrc=["\'](?:https?:)?//(?:require|arc)\.nexx(?:\.cloud|cdn\.com)/(?:sdk/)?(?P<id>\d+)',
             webpage)
         return mobj.group('id') if mobj else None

@@ -123,7 +123,7 @@ class NexxIE(InfoExtractor):
         domain_id = NexxIE._extract_domain_id(webpage)
         if domain_id:
             for video_id in re.findall(
-                    r'(?is)onPLAYReady.+?_play\.init\s*\(.+?\s*,\s*["\']?(\d+)',
+                    r'(?is)onPLAYReady.+?_play\.(?:init|(?:control\.)?addPlayer)\s*\(.+?\s*,\s*["\']?(\d+)',
                     webpage):
                 entries.append(
                     'https://api.nexx.cloud/v3/%s/videos/byid/%s'
@@ -295,13 +295,23 @@ class NexxIE(InfoExtractor):

         video = None

+        def find_video(result):
+            if isinstance(result, dict):
+                return result
+            elif isinstance(result, list):
+                vid = int(video_id)
+                for v in result:
+                    if try_get(v, lambda x: x['general']['ID'], int) == vid:
+                        return v
+            return None
+
         response = self._download_json(
             'https://arc.nexx.cloud/api/video/%s.json' % video_id,
             video_id, fatal=False)
         if response and isinstance(response, dict):
             result = response.get('result')
-            if result and isinstance(result, dict):
-                video = result
+            if result:
+                video = find_video(result)

         # not all videos work via arc, e.g. nexx:741:1269984
         if not video:
@@ -348,7 +358,7 @@ class NexxIE(InfoExtractor):
             request_token = hashlib.md5(
                 ''.join((op, domain_id, secret)).encode('utf-8')).hexdigest()

-            video = self._call_api(
+            result = self._call_api(
                 domain_id, 'videos/%s/%s' % (op, video_id), video_id, data={
                     'additionalfields': 'language,channel,actors,studio,licenseby,slug,subtitle,teaser,description',
                     'addInteractionOptions': '1',
@@ -363,6 +373,7 @@ class NexxIE(InfoExtractor):
                     'X-Request-CID': cid,
                     'X-Request-Token': request_token,
                 })
+            video = find_video(result)

         general = video['general']
         title = general['title']
@@ -399,8 +410,8 @@ class NexxIE(InfoExtractor):


 class NexxEmbedIE(InfoExtractor):
-    _VALID_URL = r'https?://embed\.nexx(?:\.cloud|cdn\.com)/\d+/(?P<id>[^/?#&]+)'
-    _TEST = {
+    _VALID_URL = r'https?://embed\.nexx(?:\.cloud|cdn\.com)/\d+/(?:video/)?(?P<id>[^/?#&]+)'
+    _TESTS = [{
         'url': 'http://embed.nexx.cloud/748/KC1614647Z27Y7T?autoplay=1',
         'md5': '16746bfc28c42049492385c989b26c4a',
         'info_dict': {
@@ -409,7 +420,6 @@ class NexxEmbedIE(InfoExtractor):
             'title': 'Nervenkitzel Achterbahn',
             'alt_title': 'Karussellbauer in Deutschland',
             'description': 'md5:ffe7b1cc59a01f585e0569949aef73cc',
-            'release_year': 2005,
             'creator': 'SPIEGEL TV',
             'thumbnail': r're:^https?://.*\.jpg$',
             'duration': 2761,
@@ -420,7 +430,10 @@ class NexxEmbedIE(InfoExtractor):
             'format': 'bestvideo',
             'skip_download': True,
         },
-    }
+    }, {
+        'url': 'https://embed.nexx.cloud/11888/video/DSRTO7UVOX06S7',
+        'only_matching': True,
+    }]

     @staticmethod
     def _extract_urls(webpage):
@@ -10,6 +10,18 @@ class NhkVodIE(InfoExtractor):
     # Content available only for a limited period of time. Visit
     # https://www3.nhk.or.jp/nhkworld/en/ondemand/ for working samples.
     _TESTS = [{
+        # clip
+        'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/9999011/',
+        'md5': '256a1be14f48d960a7e61e2532d95ec3',
+        'info_dict': {
+            'id': 'a95j5iza',
+            'ext': 'mp4',
+            'title': "Dining with the Chef - Chef Saito's Family recipe: MENCHI-KATSU",
+            'description': 'md5:5aee4a9f9d81c26281862382103b0ea5',
+            'timestamp': 1565965194,
+            'upload_date': '20190816',
+        },
+    }, {
         'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2015173/',
         'only_matching': True,
     }, {
@@ -19,7 +31,7 @@ class NhkVodIE(InfoExtractor):
         'url': 'https://www3.nhk.or.jp/nhkworld/fr/ondemand/audio/plugin-20190404-1/',
         'only_matching': True,
     }]
-    _API_URL_TEMPLATE = 'https://api.nhk.or.jp/nhkworld/%sodesdlist/v7/episode/%s/%s/all%s.json'
+    _API_URL_TEMPLATE = 'https://api.nhk.or.jp/nhkworld/%sod%slist/v7/episode/%s/%s/all%s.json'

     def _real_extract(self, url):
         lang, m_type, episode_id = re.match(self._VALID_URL, url).groups()
@@ -28,7 +40,10 @@ class NhkVodIE(InfoExtractor):

         is_video = m_type == 'video'
         episode = self._download_json(
-            self._API_URL_TEMPLATE % ('v' if is_video else 'r', episode_id, lang, '/all' if is_video else ''),
+            self._API_URL_TEMPLATE % (
+                'v' if is_video else 'r',
+                'clip' if episode_id[:4] == '9999' else 'esd',
+                episode_id, lang, '/all' if is_video else ''),
             episode_id, query={'apikey': 'EJfK8jdS57GqlupFgAfAAwr573q01y6k'})['data']['episodes'][0]
         title = episode.get('sub_title_clean') or episode['sub_title']

@@ -60,8 +75,8 @@ class NhkVodIE(InfoExtractor):
        if is_video:
            info.update({
                '_type': 'url_transparent',
                'ie_key': 'Ooyala',
                'url': 'ooyala:' + episode['vod_id'],
                'ie_key': 'Piksel',
                'url': 'https://player.piksel.com/v/refid/nhkworld/prefid/' + episode['vod_id'],
            })
        else:
            audio = episode['audio']
@@ -25,9 +25,14 @@ class NonkTubeIE(NuevoBaseIE):
    def _real_extract(self, url):
        video_id = self._match_id(url)

        info = self._extract_nuevo(
            'https://www.nonktube.com/media/nuevo/econfig.php?key=%s'
            % video_id, video_id)
        webpage = self._download_webpage(url, video_id)

        info['age_limit'] = 18
        title = self._og_search_title(webpage)
        info = self._parse_html5_media_entries(url, webpage, video_id)[0]

        info.update({
            'id': video_id,
            'title': title,
            'age_limit': 18,
        })
        return info
@@ -406,7 +406,7 @@ class NRKTVSerieBaseIE(InfoExtractor):
    def _extract_series(self, webpage, display_id, fatal=True):
        config = self._parse_json(
            self._search_regex(
                (r'INITIAL_DATA_*\s*=\s*({.+?})\s*;',
                (r'INITIAL_DATA(?:_V\d)?_*\s*=\s*({.+?})\s*;',
                 r'({.+?})\s*,\s*"[^"]+"\s*\)\s*</script>'),
                webpage, 'config', default='{}' if not fatal else NO_DEFAULT),
            display_id, fatal=False)
@@ -1,6 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor
from ..compat import (
    compat_etree_fromstring,
@@ -121,6 +123,13 @@ class OdnoklassnikiIE(InfoExtractor):
        'only_matching': True,
    }]

    @staticmethod
    def _extract_url(webpage):
        mobj = re.search(
            r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//(?:odnoklassniki|ok)\.ru/videoembed/.+?)\1', webpage)
        if mobj:
            return mobj.group('url')

    def _real_extract(self, url):
        start_time = int_or_none(compat_parse_qs(
            compat_urllib_parse_urlparse(url).query).get('fromTime', [None])[0])
@@ -20,6 +20,8 @@ from ..utils import (


class OnetBaseIE(InfoExtractor):
    _URL_BASE_RE = r'https?://(?:(?:www\.)?onet\.tv|onet100\.vod\.pl)/[a-z]/'

    def _search_mvp_id(self, webpage):
        return self._search_regex(
            r'id=(["\'])mvp:(?P<id>.+?)\1', webpage, 'mvp id', group='id')
@@ -45,7 +47,7 @@ class OnetBaseIE(InfoExtractor):
        video = response['result'].get('0')

        formats = []
        for _, formats_dict in video['formats'].items():
        for format_type, formats_dict in video['formats'].items():
            if not isinstance(formats_dict, dict):
                continue
            for format_id, format_list in formats_dict.items():
@@ -56,21 +58,31 @@ class OnetBaseIE(InfoExtractor):
                    if not video_url:
                        continue
                    ext = determine_ext(video_url)
                    if format_id == 'ism':
                    if format_id.startswith('ism'):
                        formats.extend(self._extract_ism_formats(
                            video_url, video_id, 'mss', fatal=False))
                    elif ext == 'mpd':
                        formats.extend(self._extract_mpd_formats(
                            video_url, video_id, mpd_id='dash', fatal=False))
                    elif format_id.startswith('hls'):
                        formats.extend(self._extract_m3u8_formats(
                            video_url, video_id, 'mp4', 'm3u8_native',
                            m3u8_id='hls', fatal=False))
                    else:
                        formats.append({
                        http_f = {
                            'url': video_url,
                            'format_id': format_id,
                            'height': int_or_none(f.get('vertical_resolution')),
                            'width': int_or_none(f.get('horizontal_resolution')),
                            'abr': float_or_none(f.get('audio_bitrate')),
                            'vbr': float_or_none(f.get('video_bitrate')),
                        })
                        }
                        if format_type == 'audio':
                            http_f['vcodec'] = 'none'
                        else:
                            http_f.update({
                                'height': int_or_none(f.get('vertical_resolution')),
                                'width': int_or_none(f.get('horizontal_resolution')),
                                'vbr': float_or_none(f.get('video_bitrate')),
                            })
                        formats.append(http_f)
        self._sort_formats(formats)

        meta = video.get('meta', {})
@@ -105,12 +117,12 @@ class OnetMVPIE(OnetBaseIE):


class OnetIE(OnetBaseIE):
    _VALID_URL = r'https?://(?:www\.)?onet\.tv/[a-z]/[a-z]+/(?P<display_id>[0-9a-z-]+)/(?P<id>[0-9a-z]+)'
    _VALID_URL = OnetBaseIE._URL_BASE_RE + r'[a-z]+/(?P<display_id>[0-9a-z-]+)/(?P<id>[0-9a-z]+)'
    IE_NAME = 'onet.tv'

    _TEST = {
    _TESTS = [{
        'url': 'http://onet.tv/k/openerfestival/open-er-festival-2016-najdziwniejsze-wymagania-gwiazd/qbpyqc',
        'md5': 'e3ffbf47590032ac3f27249204173d50',
        'md5': '436102770fb095c75b8bb0392d3da9ff',
        'info_dict': {
            'id': 'qbpyqc',
            'display_id': 'open-er-festival-2016-najdziwniejsze-wymagania-gwiazd',
@@ -120,7 +132,10 @@ class OnetIE(OnetBaseIE):
            'upload_date': '20160705',
            'timestamp': 1467721580,
        },
    }
    }, {
        'url': 'https://onet100.vod.pl/k/openerfestival/open-er-festival-2016-najdziwniejsze-wymagania-gwiazd/qbpyqc',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
@@ -140,18 +155,21 @@ class OnetIE(OnetBaseIE):


class OnetChannelIE(OnetBaseIE):
    _VALID_URL = r'https?://(?:www\.)?onet\.tv/[a-z]/(?P<id>[a-z]+)(?:[?#]|$)'
    _VALID_URL = OnetBaseIE._URL_BASE_RE + r'(?P<id>[a-z]+)(?:[?#]|$)'
    IE_NAME = 'onet.tv:channel'

    _TEST = {
    _TESTS = [{
        'url': 'http://onet.tv/k/openerfestival',
        'info_dict': {
            'id': 'openerfestival',
            'title': 'Open\'er Festival Live',
            'description': 'Dziękujemy, że oglądaliście transmisje. Zobaczcie nasze relacje i wywiady z artystami.',
            'title': "Open'er Festival",
            'description': "Tak było na Open'er Festival 2016! Oglądaj nasze reportaże i wywiady z artystami.",
        },
        'playlist_mincount': 46,
    }
        'playlist_mincount': 35,
    }, {
        'url': 'https://onet100.vod.pl/k/openerfestival',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        channel_id = self._match_id(url)
@@ -173,7 +191,7 @@ class OnetChannelIE(OnetBaseIE):
            'Downloading channel %s - add --no-playlist to just download video %s' % (
                channel_id, video_name))
        matches = re.findall(
            r'<a[^>]+href=[\'"](https?://(?:www\.)?onet\.tv/[a-z]/[a-z]+/[0-9a-z-]+/[0-9a-z]+)',
            r'<a[^>]+href=[\'"](%s[a-z]+/[0-9a-z-]+/[0-9a-z]+)' % self._URL_BASE_RE,
            webpage)
        entries = [
            self.url_result(video_link, OnetIE.ie_key())
@@ -4,12 +4,8 @@ from __future__ import unicode_literals
import re

from .common import InfoExtractor
from ..utils import (
    determine_ext,
    int_or_none,
    float_or_none,
    mimetype2ext,
)
from ..compat import compat_str
from ..utils import js_to_json


class OnionStudiosIE(InfoExtractor):
@@ -17,14 +13,16 @@ class OnionStudiosIE(InfoExtractor):

    _TESTS = [{
        'url': 'http://www.onionstudios.com/videos/hannibal-charges-forward-stops-for-a-cocktail-2937',
        'md5': '719d1f8c32094b8c33902c17bcae5e34',
        'md5': '5a118d466d62b5cd03647cf2c593977f',
        'info_dict': {
            'id': '2937',
            'id': '3459881',
            'ext': 'mp4',
            'title': 'Hannibal charges forward, stops for a cocktail',
            'description': 'md5:545299bda6abf87e5ec666548c6a9448',
            'thumbnail': r're:^https?://.*\.jpg$',
            'uploader': 'The A.V. Club',
            'uploader_id': 'the-av-club',
            'uploader': 'a.v. club',
            'upload_date': '20150619',
            'timestamp': 1434728546,
        },
    }, {
        'url': 'http://www.onionstudios.com/embed?id=2855&autoplay=true',
@@ -44,38 +42,12 @@ class OnionStudiosIE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)

        video_data = self._download_json(
            'http://www.onionstudios.com/video/%s.json' % video_id, video_id)

        title = video_data['title']

        formats = []
        for source in video_data.get('sources', []):
            source_url = source.get('url')
            if not source_url:
                continue
            ext = mimetype2ext(source.get('content_type')) or determine_ext(source_url)
            if ext == 'm3u8':
                formats.extend(self._extract_m3u8_formats(
                    source_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
            else:
                tbr = int_or_none(source.get('bitrate'))
                formats.append({
                    'format_id': ext + ('-%d' % tbr if tbr else ''),
                    'url': source_url,
                    'width': int_or_none(source.get('width')),
                    'tbr': tbr,
                    'ext': ext,
                })
        self._sort_formats(formats)

        return {
            'id': video_id,
            'title': title,
            'thumbnail': video_data.get('poster_url'),
            'uploader': video_data.get('channel_name'),
            'uploader_id': video_data.get('channel_slug'),
            'duration': float_or_none(video_data.get('duration', 1000)),
            'tags': video_data.get('tags'),
            'formats': formats,
        }
        webpage = self._download_webpage(
            'http://onionstudios.com/embed/dc94dc2899fe644c0e7241fa04c1b732.js',
            video_id)
        mcp_id = compat_str(self._parse_json(self._search_regex(
            r'window\.mcpMapping\s*=\s*({.+?});', webpage,
            'MCP Mapping'), video_id, js_to_json)[video_id]['mcp_id'])
        return self.url_result(
            'http://kinja.com/ajax/inset/iframe?id=mcp-' + mcp_id,
            'KinjaEmbed', mcp_id)
@@ -246,7 +246,7 @@ class OpenloadIE(InfoExtractor):
    _DOMAINS = r'''
                    (?:
                        openload\.(?:co|io|link|pw)|
                        oload\.(?:tv|best|biz|stream|site|xyz|win|download|cloud|cc|icu|fun|club|info|press|pw|life|live|space|services|website|vip)|
                        oload\.(?:tv|best|biz|stream|site|xyz|win|download|cloud|cc|icu|fun|club|info|online|monster|press|pw|life|live|space|services|website|vip)|
                        oladblock\.(?:services|xyz|me)|openloed\.co
                    )
                '''
@@ -362,6 +362,12 @@ class OpenloadIE(InfoExtractor):
    }, {
        'url': 'https://oload.services/embed/bs1NWj1dCag/',
        'only_matching': True,
    }, {
        'url': 'https://oload.online/f/W8o2UfN1vNY/',
        'only_matching': True,
    }, {
        'url': 'https://oload.monster/f/W8o2UfN1vNY/',
        'only_matching': True,
    }, {
        'url': 'https://oload.press/embed/drTBl1aOTvk/',
        'only_matching': True,
@@ -86,12 +86,13 @@ class ORFTVthekIE(InfoExtractor):
                    if value:
                        format_id_list.append(value)
                format_id = '-'.join(format_id_list)
                if determine_ext(fd['src']) == 'm3u8':
                ext = determine_ext(src)
                if ext == 'm3u8':
                    formats.extend(self._extract_m3u8_formats(
                        fd['src'], video_id, 'mp4', m3u8_id=format_id))
                elif determine_ext(fd['src']) == 'f4m':
                        src, video_id, 'mp4', m3u8_id=format_id, fatal=False))
                elif ext == 'f4m':
                    formats.extend(self._extract_f4m_formats(
                        fd['src'], video_id, f4m_id=format_id))
                        src, video_id, f4m_id=format_id, fatal=False))
                else:
                    formats.append({
                        'format_id': format_id,
@@ -6,7 +6,11 @@ from ..utils import (
    clean_html,
    determine_ext,
    int_or_none,
    KNOWN_EXTENSIONS,
    mimetype2ext,
    parse_iso8601,
    str_or_none,
    try_get,
)
@@ -24,6 +28,7 @@ class PatreonIE(InfoExtractor):
            'thumbnail': 're:^https?://.*$',
            'timestamp': 1406473987,
            'upload_date': '20140727',
            'uploader_id': '87145',
        },
    }, {
        'url': 'http://www.patreon.com/creation?hid=754133',
@@ -90,7 +95,13 @@ class PatreonIE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)
        post = self._download_json(
            'https://www.patreon.com/api/posts/' + video_id, video_id)
            'https://www.patreon.com/api/posts/' + video_id, video_id, query={
                'fields[media]': 'download_url,mimetype,size_bytes',
                'fields[post]': 'comment_count,content,embed,image,like_count,post_file,published_at,title',
                'fields[user]': 'full_name,url',
                'json-api-use-default-includes': 'false',
                'include': 'media,user',
            })
        attributes = post['data']['attributes']
        title = attributes['title'].strip()
        image = attributes.get('image') or {}
@@ -104,33 +115,42 @@ class PatreonIE(InfoExtractor):
            'comment_count': int_or_none(attributes.get('comment_count')),
        }

        def add_file(file_data):
            file_url = file_data.get('url')
            if file_url:
                info.update({
                    'url': file_url,
                    'ext': determine_ext(file_data.get('name'), 'mp3'),
                })

        for i in post.get('included', []):
            i_type = i.get('type')
            if i_type == 'attachment':
                add_file(i.get('attributes') or {})
            if i_type == 'media':
                media_attributes = i.get('attributes') or {}
                download_url = media_attributes.get('download_url')
                ext = mimetype2ext(media_attributes.get('mimetype'))
                if download_url and ext in KNOWN_EXTENSIONS:
                    info.update({
                        'ext': ext,
                        'filesize': int_or_none(media_attributes.get('size_bytes')),
                        'url': download_url,
                    })
            elif i_type == 'user':
                user_attributes = i.get('attributes')
                if user_attributes:
                    info.update({
                        'uploader': user_attributes.get('full_name'),
                        'uploader_id': str_or_none(i.get('id')),
                        'uploader_url': user_attributes.get('url'),
                    })

        if not info.get('url'):
            add_file(attributes.get('post_file') or {})
            embed_url = try_get(attributes, lambda x: x['embed']['url'])
            if embed_url:
                info.update({
                    '_type': 'url',
                    'url': embed_url,
                })

        if not info.get('url'):
            info.update({
                '_type': 'url',
                'url': attributes['embed']['url'],
            })
            post_file = attributes['post_file']
            ext = determine_ext(post_file.get('name'))
            if ext in KNOWN_EXTENSIONS:
                info.update({
                    'ext': ext,
                    'url': post_file['url'],
                })

        return info
@@ -18,81 +18,385 @@ from ..utils import (
class PeerTubeIE(InfoExtractor):
    _INSTANCES_RE = r'''(?:
                            # Taken from https://instances.joinpeertube.org/instances
                            peertube\.rainbowswingers\.net|
                            tube\.stanisic\.nl|
                            peer\.suiri\.us|
                            medias\.libox\.fr|
                            videomensoif\.ynh\.fr|
                            peertube\.travelpandas\.eu|
                            peertube\.rachetjay\.fr|
                            peertube\.montecsys\.fr|
                            tube\.eskuero\.me|
                            peer\.tube|
                            peertube\.umeahackerspace\.se|
                            tube\.nx-pod\.de|
                            video\.monsieurbidouille\.fr|
                            tube\.openalgeria\.org|
                            peertube\.pointsecu\.fr|
                            vid\.lelux\.fi|
                            video\.anormallostpod\.ovh|
                            tube\.crapaud-fou\.org|
                            peertube\.stemy\.me|
                            lostpod\.space|
                            exode\.me|
                            peertube\.snargol\.com|
                            vis\.ion\.ovh|
                            videosdulib\.re|
                            v\.mbius\.io|
                            videos\.judrey\.eu|
                            peertube\.osureplayviewer\.xyz|
                            peertube\.mathieufamily\.ovh|
                            www\.videos-libr\.es|
                            fightforinfo\.com|
                            peertube\.fediverse\.ru|
                            peertube\.oiseauroch\.fr|
                            video\.nesven\.eu|
                            v\.bearvideo\.win|
                            video\.qoto\.org|
                            justporn\.cc|
                            video\.vny\.fr|
                            peervideo\.club|
                            tube\.taker\.fr|
                            peertube\.chantierlibre\.org|
                            tube\.ipfixe\.info|
                            tube\.kicou\.info|
                            tube\.dodsorf\.as|
                            videobit\.cc|
                            video\.yukari\.moe|
                            videos\.elbinario\.net|
                            hkvideo\.live|
                            pt\.tux\.tf|
                            www\.hkvideo\.live|
                            FIGHTFORINFO\.com|
                            pt\.765racing\.com|
                            peertube\.gnumeria\.eu\.org|
                            nordenmedia\.com|
                            peertube\.co\.uk|
                            tube\.darfweb\.eu|
                            tube\.kalah-france\.org|
                            0ch\.in|
                            vod\.mochi\.academy|
                            film\.node9\.org|
                            peertube\.hatthieves\.es|
                            video\.fitchfamily\.org|
                            peertube\.ddns\.net|
                            video\.ifuncle\.kr|
                            video\.fdlibre\.eu|
                            tube\.22decembre\.eu|
                            peertube\.harmoniescreatives\.com|
                            tube\.fabrigli\.fr|
                            video\.thedwyers\.co|
                            video\.bruitbruit\.com|
                            peertube\.foxfam\.club|
                            peer\.philoxweb\.be|
                            videos\.bugs\.social|
                            peertube\.malbert\.xyz|
                            peertube\.bilange\.ca|
                            libretube\.net|
                            diytelevision\.com|
                            peertube\.fedilab\.app|
                            libre\.video|
                            video\.mstddntfdn\.online|
                            us\.tv|
                            peertube\.sl-network\.fr|
                            peertube\.dynlinux\.io|
                            peertube\.david\.durieux\.family|
                            peertube\.linuxrocks\.online|
                            peerwatch\.xyz|
                            v\.kretschmann\.social|
                            tube\.otter\.sh|
                            yt\.is\.nota\.live|
                            tube\.dragonpsi\.xyz|
                            peertube\.boneheadmedia\.com|
                            videos\.funkwhale\.audio|
                            watch\.44con\.com|
                            peertube\.gcaillaut\.fr|
                            peertube\.icu|
                            pony\.tube|
                            spacepub\.space|
                            tube\.stbr\.io|
                            v\.mom-gay\.faith|
                            tube\.port0\.xyz|
                            peertube\.simounet\.net|
                            play\.jergefelt\.se|
                            peertube\.zeteo\.me|
                            tube\.danq\.me|
                            peertube\.kerenon\.com|
                            tube\.fab-l3\.org|
                            tube\.calculate\.social|
                            peertube\.mckillop\.org|
                            tube\.netzspielplatz\.de|
                            vod\.ksite\.de|
                            peertube\.laas\.fr|
                            tube\.govital\.net|
                            peertube\.stephenson\.cc|
                            bistule\.nohost\.me|
                            peertube\.kajalinifi\.de|
                            video\.ploud\.jp|
                            video\.omniatv\.com|
                            peertube\.ffs2play\.fr|
                            peertube\.leboulaire\.ovh|
                            peertube\.tronic-studio\.com|
                            peertube\.public\.cat|
                            peertube\.metalbanana\.net|
                            video\.1000i100\.fr|
                            peertube\.alter-nativ-voll\.de|
                            tube\.pasa\.tf|
                            tube\.worldofhauru\.xyz|
                            pt\.kamp\.site|
                            peertube\.teleassist\.fr|
                            videos\.mleduc\.xyz|
                            conf\.tube|
                            media\.privacyinternational\.org|
                            pt\.forty-two\.nl|
                            video\.halle-leaks\.de|
                            video\.grosskopfgames\.de|
                            peertube\.schaeferit\.de|
                            peertube\.jackbot\.fr|
                            tube\.extinctionrebellion\.fr|
                            peertube\.f-si\.org|
                            video\.subak\.ovh|
                            videos\.koweb\.fr|
                            peertube\.zergy\.net|
                            peertube\.roflcopter\.fr|
                            peertube\.floss-marketing-school\.com|
                            vloggers\.social|
                            peertube\.iriseden\.eu|
                            videos\.ubuntu-paris\.org|
                            peertube\.mastodon\.host|
                            armstube\.com|
                            peertube\.s2s\.video|
                            peertube\.lol|
                            tube\.open-plug\.eu|
                            open\.tube|
                            peertube\.ch|
                            peertube\.normandie-libre\.fr|
                            peertube\.slat\.org|
                            video\.lacaveatonton\.ovh|
                            peertube\.uno|
                            peertube\.servebeer\.com|
                            peertube\.fedi\.quebec|
                            tube\.h3z\.jp|
                            tube\.plus200\.com|
                            peertube\.eric\.ovh|
                            tube\.metadocs\.cc|
                            tube\.unmondemeilleur\.eu|
                            gouttedeau\.space|
                            video\.antirep\.net|
                            nrop\.cant\.at|
                            tube\.ksl-bmx\.de|
                            tube\.plaf\.fr|
                            tube\.tchncs\.de|
                            video\.devinberg\.com|
                            hitchtube\.fr|
                            peertube\.kosebamse\.com|
                            yunopeertube\.myddns\.me|
                            peertube\.varney\.fr|
                            peertube\.anon-kenkai\.com|
                            tube\.maiti\.info|
                            tubee\.fr|
                            videos\.dinofly\.com|
                            toobnix\.org|
                            videotape\.me|
                            voca\.tube|
                            video\.heromuster\.com|
                            video\.lemediatv\.fr|
                            video\.up\.edu\.ph|
                            balafon\.video|
                            video\.ivel\.fr|
                            thickrips\.cloud|
                            pt\.laurentkruger\.fr|
                            video\.monarch-pass\.net|
                            peertube\.artica\.center|
                            video\.alternanet\.fr|
                            indymotion\.fr|
                            fanvid\.stopthatimp\.net|
                            video\.farci\.org|
                            v\.lesterpig\.com|
                            video\.okaris\.de|
                            tube\.pawelko\.net|
                            peertube\.mablr\.org|
                            tube\.fede\.re|
                            pytu\.be|
                            evertron\.tv|
                            devtube\.dev-wiki\.de|
                            raptube\.antipub\.org|
                            video\.selea\.se|
                            peertube\.mygaia\.org|
                            video\.oh14\.de|
                            peertube\.livingutopia\.org|
                            peertube\.the-penguin\.de|
                            tube\.thechangebook\.org|
                            tube\.anjara\.eu|
                            pt\.pube\.tk|
                            video\.samedi\.pm|
                            mplayer\.demouliere\.eu|
                            widemus\.de|
                            peertube\.me|
                            peertube\.zapashcanon\.fr|
                            video\.latavernedejohnjohn\.fr|
                            peertube\.pcservice46\.fr|
                            peertube\.mazzonetto\.eu|
                            video\.irem\.univ-paris-diderot\.fr|
                            video\.livecchi\.cloud|
                            alttube\.fr|
                            video\.coop\.tools|
                            video\.cabane-libre\.org|
                            peertube\.openstreetmap\.fr|
                            videos\.alolise\.org|
                            irrsinn\.video|
                            video\.antopie\.org|
                            scitech\.video|
                            tube2\.nemsia\.org|
                            video\.amic37\.fr|
                            peertube\.freeforge\.eu|
                            video\.arbitrarion\.com|
                            video\.datsemultimedia\.com|
                            stoptrackingus\.tv|
                            peertube\.ricostrongxxx\.com|
                            docker\.videos\.lecygnenoir\.info|
                            peertube\.togart\.de|
                            tube\.postblue\.info|
                            videos\.domainepublic\.net|
                            peertube\.cyber-tribal\.com|
                            video\.gresille\.org|
                            peertube\.dsmouse\.net|
                            cinema\.yunohost\.support|
                            tube\.theocevaer\.fr|
                            repro\.video|
                            tube\.4aem\.com|
                            quaziinc\.com|
                            peertube\.metawurst\.space|
                            videos\.wakapo\.com|
                            video\.ploud\.fr|
                            video\.freeradical\.zone|
                            tube\.valinor\.fr|
                            refuznik\.video|
                            pt\.kircheneuenburg\.de|
                            peertube\.asrun\.eu|
                            peertube\.lagob\.fr|
                            videos\.side-ways\.net|
                            91video\.online|
                            video\.valme\.io|
                            video\.taboulisme\.com|
                            videos-libr\.es|
                            tv\.mooh\.fr|
                            nuage\.acostey\.fr|
                            video\.monsieur-a\.fr|
                            peertube\.librelois\.fr|
                            videos\.pair2jeux\.tube|
                            videos\.pueseso\.club|
                            peer\.mathdacloud\.ovh|
                            media\.assassinate-you\.net|
                            vidcommons\.org|
                            ptube\.rousset\.nom\.fr|
                            tube\.cyano\.at|
                            videos\.squat\.net|
                            video\.iphodase\.fr|
                            peertube\.makotoworkshop\.org|
                            peertube\.serveur\.slv-valbonne\.fr|
                            vault\.mle\.party|
                            hostyour\.tv|
                            videos\.hack2g2\.fr|
                            libre\.tube|
                            pire\.artisanlogiciel\.net|
                            videos\.numerique-en-commun\.fr|
                            video\.netsyms\.com|
                            video\.die-partei\.social|
                            video\.writeas\.org|
                            peertube\.swarm\.solvingmaz\.es|
                            tube\.pericoloso\.ovh|
                            watching\.cypherpunk\.observer|
                            videos\.adhocmusic\.com|
                            tube\.rfc1149\.net|
                            peertube\.librelabucm\.org|
                            videos\.numericoop\.fr|
                            peertube\.koehn\.com|
                            peertube\.anarchmusicall\.net|
                            tube\.kampftoast\.de|
                            vid\.y-y\.li|
                            peertube\.xtenz\.xyz|
                            diode\.zone|
                            tube\.egf\.mn|
                            peertube\.nomagic\.uk|
                            visionon\.tv|
                            videos\.koumoul\.com|
                            video\.rastapuls\.com|
                            video\.mantlepro\.com|
                            video\.deadsuperhero\.com|
                            peertube\.musicstudio\.pro|
                            peertube\.we-keys\.fr|
                            artitube\.artifaille\.fr|
                            peertube\.ethernia\.net|
                            tube\.midov\.pl|
                            peertube\.fr|
                            watch\.snoot\.tube|
                            peertube\.donnadieu\.fr|
                            argos\.aquilenet\.fr|
                            tube\.nemsia\.org|
                            tube\.bruniau\.net|
                            videos\.darckoune\.moe|
                            tube\.traydent\.info|
                            dev\.videos\.lecygnenoir\.info|
                            peertube\.nayya\.org|
                            peertube\.live|
                            peertube\.mofgao\.space|
                            video\.lequerrec\.eu|
                            peertube\.amicale\.net|
                            aperi\.tube|
                            tube\.ac-lyon\.fr|
                            video\.lw1\.at|
                            www\.yiny\.org|
                            videos\.pofilo\.fr|
                            tube\.lou\.lt|
                            choob\.h\.etbus\.ch|
                            tube\.hoga\.fr|
                            peertube\.heberge\.fr|
                            video\.obermui\.de|
                            videos\.cloudfrancois\.fr|
                            betamax\.video|
                            video\.typica\.us|
                            tube\.piweb\.be|
                            video\.blender\.org|
                            peertube\.cat|
                            tube\.kdy\.ch|
                            pe\.ertu\.be|
                            peertube\.social|
                            videos\.lescommuns\.org|
                            tv\.datamol\.org|
                            videonaute\.fr|
                            dialup\.express|
                            peertube\.nogafa\.org|
                            peertube\.pl|
                            megatube\.lilomoino\.fr|
                            peertube\.tamanoir\.foucry\.net|
                            peertube\.inapurna\.org|
                            peertube\.netzspielplatz\.de|
                            video\.deadsuperhero\.com|
                            peertube\.devosi\.org|
                            peertube\.1312\.media|
                            tube\.worldofhauru\.xyz|
                            tube\.bootlicker\.party|
                            skeptikon\.fr|
                            peertube\.geekshell\.fr|
                            tube\.opportunis\.me|
                            peertube\.peshane\.net|
                            video\.blueline\.mg|
                            tube\.homecomputing\.fr|
                            videos\.cloudfrancois\.fr|
                            peertube\.viviers-fibre\.net|
                            tube\.ouahpiti\.info|
                            video\.tedomum\.net|
                            video\.g3l\.org|
                            fontube\.fr|
                            peertube\.gaialabs\.ch|
                            peertube\.extremely\.online|
                            peertube\.public-infrastructure\.eu|
                            tube\.kher\.nl|
                            peertube\.qtg\.fr|
                            tube\.22decembre\.eu|
                            facegirl\.me|
                            video\.migennes\.net|
                            janny\.moe|
                            tube\.p2p\.legal|
                            video\.atlanti\.se|
                            troll\.tv|
                            peertube\.geekael\.fr|
                            vid\.leotindall\.com|
                            video\.anormallostpod\.ovh|
                            p-tube\.h3z\.jp|
                            tube\.darfweb\.eu|
                            videos\.iut-orsay\.fr|
                            peertube\.solidev\.net|
                            videos\.symphonie-of-code\.fr|
                            testtube\.ortg\.de|
                            videos\.cemea\.org|
                            peertube\.gwendalavir\.eu|
                            video\.passageenseine\.fr|
                            videos\.festivalparminous\.org|
                            peertube\.touhoppai\.moe|
                            peertube\.duckdns\.org|
                            sikke\.fi|
                            peertube\.mastodon\.host|
                            firedragonvideos\.com|
                            vidz\.dou\.bet|
                            peertube\.koehn\.com|
                            peer\.hostux\.social|
                            share\.tube|
                            peertube\.walkingmountains\.fr|
                            medias\.libox\.fr|
                            peertube\.moe|
                            peertube\.xyz|
                            jp\.peertube\.network|
                            videos\.benpro\.fr|
                            tube\.otter\.sh|
                            peertube\.angristan\.xyz|
                            peertube\.parleur\.net|
                            peer\.ecutsa\.fr|
                            peertube\.heraut\.eu|
                            peertube\.tifox\.fr|
                            peertube\.maly\.io|
                            vod\.mochi\.academy|
                            exode\.me|
                            coste\.video|
                            tube\.aquilenet\.fr|
                            peertube\.gegeweb\.eu|
                            framatube\.org|
@@ -100,18 +404,11 @@ class PeerTubeIE(InfoExtractor):
                            tube\.conferences-gesticulees\.net|
                            peertube\.datagueule\.tv|
                            video\.lqdn\.fr|
                            meilleurtube\.delire\.party|
                            tube\.mochi\.academy|
                            peertube\.dav\.li|
                            media\.zat\.im|
                            pytu\.be|
                            peertube\.valvin\.fr|
                            peertube\.nsa\.ovh|
                            video\.colibris-outilslibres\.org|
                            video\.hispagatos\.org|
                            tube\.svnet\.fr|
                            peertube\.video|
                            videos\.lecygnenoir\.info|
                            peertube3\.cpy\.re|
                            peertube2\.cpy\.re|
                            videos\.tcit\.fr|
@@ -126,7 +423,7 @@ class PeerTubeIE(InfoExtractor):
                    (?P<id>%s)
                    ''' % (_INSTANCES_RE, _UUID_RE)
    _TESTS = [{
        'url': 'https://peertube.moe/videos/watch/2790feb0-8120-4e63-9af3-c943c69f5e6c',
        'url': 'https://peertube.cpy.re/videos/watch/2790feb0-8120-4e63-9af3-c943c69f5e6c',
        'md5': '80f24ff364cc9d333529506a263e7feb',
        'info_dict': {
            'id': '2790feb0-8120-4e63-9af3-c943c69f5e6c',
@@ -17,12 +17,54 @@ class PeriscopeBaseIE(InfoExtractor):
            'https://api.periscope.tv/api/v2/%s' % method,
            item_id, query=query)

    def _parse_broadcast_data(self, broadcast, video_id):
        title = broadcast['status']
        uploader = broadcast.get('user_display_name') or broadcast.get('username')
        title = '%s - %s' % (uploader, title) if uploader else title
        is_live = broadcast.get('state').lower() == 'running'

        thumbnails = [{
            'url': broadcast[image],
        } for image in ('image_url', 'image_url_small') if broadcast.get(image)]

        return {
            'id': broadcast.get('id') or video_id,
            'title': self._live_title(title) if is_live else title,
            'timestamp': parse_iso8601(broadcast.get('created_at')),
            'uploader': uploader,
            'uploader_id': broadcast.get('user_id') or broadcast.get('username'),
            'thumbnails': thumbnails,
            'view_count': int_or_none(broadcast.get('total_watched')),
            'tags': broadcast.get('tags'),
            'is_live': is_live,
        }

    @staticmethod
    def _extract_common_format_info(broadcast):
        return broadcast.get('state').lower(), int_or_none(broadcast.get('width')), int_or_none(broadcast.get('height'))

    @staticmethod
    def _add_width_and_height(f, width, height):
        for key, val in (('width', width), ('height', height)):
            if not f.get(key):
                f[key] = val

    def _extract_pscp_m3u8_formats(self, m3u8_url, video_id, format_id, state, width, height, fatal=True):
        m3u8_formats = self._extract_m3u8_formats(
            m3u8_url, video_id, 'mp4',
            entry_protocol='m3u8_native'
            if state in ('ended', 'timed_out') else 'm3u8',
            m3u8_id=format_id, fatal=fatal)
        if len(m3u8_formats) == 1:
            self._add_width_and_height(m3u8_formats[0], width, height)
        return m3u8_formats


class PeriscopeIE(PeriscopeBaseIE):
    IE_DESC = 'Periscope'
    IE_NAME = 'periscope'
    _VALID_URL = r'https?://(?:www\.)?(?:periscope|pscp)\.tv/[^/]+/(?P<id>[^/?#]+)'
    # Alive example URLs can be found here http://onperiscope.com/
    # Alive example URLs can be found here https://www.periscope.tv/
    _TESTS = [{
        'url': 'https://www.periscope.tv/w/aJUQnjY3MjA3ODF8NTYxMDIyMDl2zCg2pECBgwTqRpQuQD352EMPTKQjT4uqlM3cgWFA-g==',
        'md5': '65b57957972e503fcbbaeed8f4fa04ca',
@@ -61,21 +103,9 @@ class PeriscopeIE(PeriscopeBaseIE):
            'accessVideoPublic', {'broadcast_id': token}, token)

        broadcast = stream['broadcast']
        title = broadcast['status']
        info = self._parse_broadcast_data(broadcast, token)

        uploader = broadcast.get('user_display_name') or broadcast.get('username')
        uploader_id = (broadcast.get('user_id') or broadcast.get('username'))

        title = '%s - %s' % (uploader, title) if uploader else title
        state = broadcast.get('state').lower()
        if state == 'running':
            title = self._live_title(title)
        timestamp = parse_iso8601(broadcast.get('created_at'))

        thumbnails = [{
            'url': broadcast[image],
        } for image in ('image_url', 'image_url_small') if broadcast.get(image)]

        width = int_or_none(broadcast.get('width'))
        height = int_or_none(broadcast.get('height'))
@@ -92,32 +122,20 @@ class PeriscopeIE(PeriscopeBaseIE):
                continue
            video_urls.add(video_url)
            if format_id != 'rtmp':
                m3u8_formats = self._extract_m3u8_formats(
                    video_url, token, 'mp4',
                    entry_protocol='m3u8_native'
                    if state in ('ended', 'timed_out') else 'm3u8',
                    m3u8_id=format_id, fatal=False)
                if len(m3u8_formats) == 1:
                    add_width_and_height(m3u8_formats[0])
                m3u8_formats = self._extract_pscp_m3u8_formats(
                    video_url, token, format_id, state, width, height, False)
                formats.extend(m3u8_formats)
                continue
            rtmp_format = {
                'url': video_url,
                'ext': 'flv' if format_id == 'rtmp' else 'mp4',
            }
            add_width_and_height(rtmp_format)
            self._add_width_and_height(rtmp_format)
            formats.append(rtmp_format)
        self._sort_formats(formats)

        return {
            'id': broadcast.get('id') or token,
            'title': title,
            'timestamp': timestamp,
            'uploader': uploader,
            'uploader_id': uploader_id,
            'thumbnails': thumbnails,
            'formats': formats,
        }
        info['formats'] = formats
        return info


class PeriscopeUserIE(PeriscopeBaseIE):
@ -15,7 +15,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class PikselIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://player\.piksel\.com/v/(?P<id>[a-z0-9]+)'
|
||||
_VALID_URL = r'https?://player\.piksel\.com/v/(?:refid/[^/]+/prefid/)?(?P<id>[a-z0-9_]+)'
|
||||
_TESTS = [
|
||||
{
|
||||
'url': 'http://player.piksel.com/v/ums2867l',
|
||||
@ -40,6 +40,11 @@ class PikselIE(InfoExtractor):
|
||||
'timestamp': 1486171129,
|
||||
'upload_date': '20170204'
|
||||
}
|
||||
},
|
||||
{
|
||||
# https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2019240/
|
||||
'url': 'http://player.piksel.com/v/refid/nhkworld/prefid/nw_vod_v_en_2019_240_20190823233000_02_1566873477',
|
||||
'only_matching': True,
|
||||
}
|
||||
]
|
||||
|
||||
@ -52,8 +57,11 @@ class PikselIE(InfoExtractor):
|
||||
return mobj.group('url')
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
display_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
video_id = self._search_regex(
|
||||
r'data-de-program-uuid=[\'"]([a-z0-9]+)',
|
||||
webpage, 'program uuid', default=display_id)
|
||||
app_token = self._search_regex([
|
||||
r'clientAPI\s*:\s*"([^"]+)"',
|
||||
r'data-de-api-key\s*=\s*"([^"]+)"'
|
||||
|
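Note on the `_VALID_URL` change above: a minimal sketch, outside the extractor class, of how the old and new patterns treat the two test URLs from this diff. The `match_id` helper is hypothetical and only stands in for the extractor's own ID matching; the patterns and URLs are copied from the hunk.

```python
import re

# Patterns copied from the PikselIE hunk above.
OLD_VALID_URL = r'https?://player\.piksel\.com/v/(?P<id>[a-z0-9]+)'
NEW_VALID_URL = r'https?://player\.piksel\.com/v/(?:refid/[^/]+/prefid/)?(?P<id>[a-z0-9_]+)'


def match_id(pattern, url):
    """Hypothetical helper: return the 'id' group, or None if there is no match."""
    mobj = re.match(pattern, url)
    return mobj.group('id') if mobj else None


short_url = 'http://player.piksel.com/v/ums2867l'
refid_url = ('http://player.piksel.com/v/refid/nhkworld/prefid/'
             'nw_vod_v_en_2019_240_20190823233000_02_1566873477')

print(match_id(NEW_VALID_URL, short_url))
print(match_id(NEW_VALID_URL, refid_url))
# The old pattern stops at the first path segment of a refid-style URL.
print(match_id(OLD_VALID_URL, refid_url))
```

Because `re.match` only anchors at the start, the old pattern still matches a refid-style URL but captures the literal segment `refid` as the ID, which is presumably why the optional `refid/.../prefid/` prefix and the underscore were added.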
@ -403,6 +403,15 @@ class PornHubUserIE(PornHubPlaylistBaseIE):
|
||||
|
||||
|
||||
class PornHubPagedPlaylistBaseIE(PornHubPlaylistBaseIE):
|
||||
@staticmethod
|
||||
def _has_more(webpage):
|
||||
return re.search(
|
||||
r'''(?x)
|
||||
<li[^>]+\bclass=["\']page_next|
|
||||
<link[^>]+\brel=["\']next|
|
||||
<button[^>]+\bid=["\']moreDataBtn
|
||||
''', webpage) is not None
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
host = mobj.group('host')
|
||||
@ -411,13 +420,11 @@ class PornHubPagedPlaylistBaseIE(PornHubPlaylistBaseIE):
|
||||
page = int_or_none(self._search_regex(
|
||||
r'\bpage=(\d+)', url, 'page', default=None))
|
||||
|
||||
page_url = self._make_page_url(url)
|
||||
|
||||
entries = []
|
||||
for page_num in (page, ) if page is not None else itertools.count(1):
|
||||
try:
|
||||
webpage = self._download_webpage(
|
||||
page_url, item_id, 'Downloading page %d' % page_num,
|
||||
url, item_id, 'Downloading page %d' % page_num,
|
||||
query={'page': page_num})
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
|
||||
@ -547,18 +554,6 @@ class PornHubPagedVideoListIE(PornHubPagedPlaylistBaseIE):
|
||||
if PornHubIE.suitable(url) or PornHubUserIE.suitable(url) or PornHubUserVideosUploadIE.suitable(url)
|
||||
else super(PornHubPagedVideoListIE, cls).suitable(url))
|
||||
|
||||
def _make_page_url(self, url):
|
||||
return url
|
||||
|
||||
@staticmethod
|
||||
def _has_more(webpage):
|
||||
return re.search(
|
||||
r'''(?x)
|
||||
<li[^>]+\bclass=["\']page_next|
|
||||
<link[^>]+\brel=["\']next|
|
||||
<button[^>]+\bid=["\']moreDataBtn
|
||||
''', webpage) is not None
|
||||
|
||||
|
||||
class PornHubUserVideosUploadIE(PornHubPagedPlaylistBaseIE):
|
||||
_VALID_URL = r'(?P<url>https?://(?:[^/]+\.)?(?P<host>pornhub\.(?:com|net))/(?:(?:user|channel)s|model|pornstar)/(?P<id>[^/]+)/videos/upload)'
|
||||
@ -572,11 +567,3 @@ class PornHubUserVideosUploadIE(PornHubPagedPlaylistBaseIE):
|
||||
'url': 'https://www.pornhub.com/model/zoe_ph/videos/upload',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _make_page_url(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
return '%s/ajax' % mobj.group('url')
|
||||
|
||||
@staticmethod
|
||||
def _has_more(webpage):
|
||||
return True
|
||||
|
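Note on `_has_more` above: a standalone sketch of the pagination check that this commit moves into `PornHubPagedPlaylistBaseIE`. The regular expression is copied from the hunk; the module-level function and the sample HTML strings are illustrative assumptions, not part of the extractor.

```python
import re

# Verbose-mode pattern copied from the diff: any one of these markers
# is taken to mean that another page of results exists.
_HAS_MORE_RE = re.compile(r'''(?x)
    <li[^>]+\bclass=["\']page_next|
    <link[^>]+\brel=["\']next|
    <button[^>]+\bid=["\']moreDataBtn
''')


def has_more(webpage):
    """Return True if the page HTML contains a 'next page' marker."""
    return _HAS_MORE_RE.search(webpage) is not None


# Illustrative inputs (made up for this sketch).
print(has_more('<li class="page_next"><a href="?page=2">Next</a></li>'))
print(has_more('<li class="page_prev"><a href="?page=1">Prev</a></li>'))
```

With the check defined once in the base class, the per-subclass overrides removed later in this diff (including the one that always returned `True`) are no longer needed.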
@ -1,70 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
ExtractorError,
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class PromptFileIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?promptfile\.com/l/(?P<id>[0-9A-Z\-]+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.promptfile.com/l/86D1CE8462-576CAAE416',
|
||||
'md5': '5a7e285a26e0d66d9a263fae91bc92ce',
|
||||
'info_dict': {
|
||||
'id': '86D1CE8462-576CAAE416',
|
||||
'ext': 'mp4',
|
||||
'title': 'oceans.mp4',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
if re.search(r'<div.+id="not_found_msg".+>(?!We are).+</div>[^-]', webpage) is not None:
|
||||
raise ExtractorError('Video %s does not exist' % video_id,
|
||||
expected=True)
|
||||
|
||||
chash = self._search_regex(
|
||||
r'val\("([^"]*)"\s*\+\s*\$\("#chash"\)', webpage, 'chash')
|
||||
fields = self._hidden_inputs(webpage)
|
||||
keys = list(fields.keys())
|
||||
chash_key = keys[0] if len(keys) == 1 else next(
|
||||
key for key in keys if key.startswith('cha'))
|
||||
fields[chash_key] = chash + fields[chash_key]
|
||||
|
||||
webpage = self._download_webpage(
|
||||
url, video_id, 'Downloading video page',
|
||||
data=urlencode_postdata(fields),
|
||||
headers={'Content-type': 'application/x-www-form-urlencoded'})
|
||||
|
||||
video_url = self._search_regex(
|
||||
(r'<a[^>]+href=(["\'])(?P<url>(?:(?!\1).)+)\1[^>]*>\s*Download File',
|
||||
r'<a[^>]+href=(["\'])(?P<url>https?://(?:www\.)?promptfile\.com/file/(?:(?!\1).)+)\1'),
|
||||
webpage, 'video url', group='url')
|
||||
title = self._html_search_regex(
|
||||
r'<span.+title="([^"]+)">', webpage, 'title')
|
||||
thumbnail = self._html_search_regex(
|
||||
r'<div id="player_overlay">.*button>.*?<img src="([^"]+)"',
|
||||
webpage, 'thumbnail', fatal=False, flags=re.DOTALL)
|
||||
|
||||
formats = [{
|
||||
'format_id': 'sd',
|
||||
'url': video_url,
|
||||
'ext': determine_ext(title),
|
||||
}]
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'thumbnail': thumbnail,
|
||||
'formats': formats,
|
||||
}
|
@ -25,21 +25,21 @@ class PuhuTVIE(InfoExtractor):
|
||||
_TESTS = [{
|
||||
# film
|
||||
'url': 'https://puhutv.com/sut-kardesler-izle',
|
||||
'md5': 'fbd8f2d8e7681f8bcd51b592475a6ae7',
|
||||
'md5': 'a347470371d56e1585d1b2c8dab01c96',
|
||||
'info_dict': {
|
||||
'id': '5085',
|
||||
'display_id': 'sut-kardesler',
|
||||
'ext': 'mp4',
|
||||
'title': 'Süt Kardeşler',
|
||||
'description': 'md5:405fd024df916ca16731114eb18e511a',
|
||||
'description': 'md5:ca09da25b7e57cbb5a9280d6e48d17aa',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'duration': 4832.44,
|
||||
'creator': 'Arzu Film',
|
||||
'timestamp': 1469778212,
|
||||
'upload_date': '20160729',
|
||||
'timestamp': 1561062602,
|
||||
'upload_date': '20190620',
|
||||
'release_year': 1976,
|
||||
'view_count': int,
|
||||
'tags': ['Aile', 'Komedi', 'Klasikler'],
|
||||
'tags': list,
|
||||
},
|
||||
}, {
|
||||
# episode, geo restricted, bypassable with --geo-verification-proxy
|
||||
@ -64,9 +64,10 @@ class PuhuTVIE(InfoExtractor):
|
||||
display_id)['data']
|
||||
|
||||
video_id = compat_str(info['id'])
|
||||
title = info.get('name') or info['title']['name']
|
||||
show = info.get('title') or {}
|
||||
title = info.get('name') or show['name']
|
||||
if info.get('display_name'):
|
||||
title = '%s %s' % (title, info.get('display_name'))
|
||||
title = '%s %s' % (title, info['display_name'])
|
||||
|
||||
try:
|
||||
videos = self._download_json(
|
||||
@ -78,17 +79,36 @@ class PuhuTVIE(InfoExtractor):
|
||||
self.raise_geo_restricted()
|
||||
raise
|
||||
|
||||
urls = []
|
||||
formats = []
|
||||
|
||||
def add_http_from_hls(m3u8_f):
|
||||
http_url = m3u8_f['url'].replace('/hls/', '/mp4/').replace('/chunklist.m3u8', '.mp4')
|
||||
if http_url != m3u8_f['url']:
|
||||
f = m3u8_f.copy()
|
||||
f.update({
|
||||
'format_id': f['format_id'].replace('hls', 'http'),
|
||||
'protocol': 'http',
|
||||
'url': http_url,
|
||||
})
|
||||
formats.append(f)
|
||||
|
||||
for video in videos['data']['videos']:
|
||||
media_url = url_or_none(video.get('url'))
|
||||
if not media_url:
|
||||
if not media_url or media_url in urls:
|
||||
continue
|
||||
urls.append(media_url)
|
||||
|
||||
playlist = video.get('is_playlist')
|
||||
if video.get('stream_type') == 'hls' and playlist is True:
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
if (video.get('stream_type') == 'hls' and playlist is True) or 'playlist.m3u8' in media_url:
|
||||
m3u8_formats = self._extract_m3u8_formats(
|
||||
media_url, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
m3u8_id='hls', fatal=False)
|
||||
for m3u8_f in m3u8_formats:
|
||||
formats.append(m3u8_f)
|
||||
add_http_from_hls(m3u8_f)
|
||||
continue
|
||||
|
||||
quality = int_or_none(video.get('quality'))
|
||||
f = {
|
||||
'url': media_url,
|
||||
@ -96,34 +116,29 @@ class PuhuTVIE(InfoExtractor):
|
||||
'height': quality
|
||||
}
|
||||
video_format = video.get('video_format')
|
||||
if video_format == 'hls' and playlist is False:
|
||||
is_hls = (video_format == 'hls' or '/hls/' in media_url or '/chunklist.m3u8' in media_url) and playlist is False
|
||||
if is_hls:
|
||||
format_id = 'hls'
|
||||
f['protocol'] = 'm3u8_native'
|
||||
elif video_format == 'mp4':
|
||||
format_id = 'http'
|
||||
|
||||
else:
|
||||
continue
|
||||
if quality:
|
||||
format_id += '-%sp' % quality
|
||||
f['format_id'] = format_id
|
||||
formats.append(f)
|
||||
if is_hls:
|
||||
add_http_from_hls(f)
|
||||
self._sort_formats(formats)
|
||||
|
||||
description = try_get(
|
||||
info, lambda x: x['title']['description'],
|
||||
compat_str) or info.get('description')
|
||||
timestamp = unified_timestamp(info.get('created_at'))
|
||||
creator = try_get(
|
||||
info, lambda x: x['title']['producer']['name'], compat_str)
|
||||
show, lambda x: x['producer']['name'], compat_str)
|
||||
|
||||
duration = float_or_none(
|
||||
try_get(info, lambda x: x['content']['duration_in_ms'], int),
|
||||
scale=1000)
|
||||
view_count = try_get(info, lambda x: x['content']['watch_count'], int)
|
||||
content = info.get('content') or {}
|
||||
|
||||
images = try_get(
|
||||
info, lambda x: x['content']['images']['wide'], dict) or {}
|
||||
content, lambda x: x['images']['wide'], dict) or {}
|
||||
thumbnails = []
|
||||
for image_id, image_url in images.items():
|
||||
if not isinstance(image_url, compat_str):
|
||||
@ -137,14 +152,8 @@ class PuhuTVIE(InfoExtractor):
|
||||
})
|
||||
thumbnails.append(t)
|
||||
|
||||
release_year = try_get(info, lambda x: x['title']['released_at'], int)
|
||||
|
||||
season_number = int_or_none(info.get('season_number'))
|
||||
season_id = str_or_none(info.get('season_id'))
|
||||
episode_number = int_or_none(info.get('episode_number'))
|
||||
|
||||
tags = []
|
||||
for genre in try_get(info, lambda x: x['title']['genres'], list) or []:
|
||||
for genre in show.get('genres') or []:
|
||||
if not isinstance(genre, dict):
|
||||
continue
|
||||
genre_name = genre.get('name')
|
||||
@ -152,12 +161,11 @@ class PuhuTVIE(InfoExtractor):
|
||||
tags.append(genre_name)
|
||||
|
||||
subtitles = {}
|
||||
for subtitle in try_get(
|
||||
info, lambda x: x['content']['subtitles'], list) or []:
|
||||
for subtitle in content.get('subtitles') or []:
|
||||
if not isinstance(subtitle, dict):
|
||||
continue
|
||||
lang = subtitle.get('language')
|
||||
sub_url = url_or_none(subtitle.get('url'))
|
||||
sub_url = url_or_none(subtitle.get('url') or subtitle.get('file'))
|
||||
if not lang or not isinstance(lang, compat_str) or not sub_url:
|
||||
continue
|
||||
subtitles[self._SUBTITLE_LANGS.get(lang, lang)] = [{
|
||||
@ -168,15 +176,15 @@ class PuhuTVIE(InfoExtractor):
|
||||
'id': video_id,
|
||||
'display_id': display_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'season_id': season_id,
|
||||
'season_number': season_number,
|
||||
'episode_number': episode_number,
|
||||
'release_year': release_year,
|
||||
'timestamp': timestamp,
|
||||
'description': info.get('description') or show.get('description'),
|
||||
'season_id': str_or_none(info.get('season_id')),
|
||||
'season_number': int_or_none(info.get('season_number')),
|
||||
'episode_number': int_or_none(info.get('episode_number')),
|
||||
'release_year': int_or_none(show.get('released_at')),
|
||||
'timestamp': unified_timestamp(info.get('created_at')),
|
||||
'creator': creator,
|
||||
'view_count': view_count,
|
||||
'duration': duration,
|
||||
'view_count': int_or_none(content.get('watch_count')),
|
||||
'duration': float_or_none(content.get('duration_in_ms'), 1000),
|
||||
'tags': tags,
|
||||
'subtitles': subtitles,
|
||||
'thumbnails': thumbnails,
|
||||
|
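Note on `add_http_from_hls` above: a minimal sketch of the same URL rewrite as a pure function, so the derivation of the progressive HTTP format from an HLS format can be seen in isolation. The function name `http_format_from_hls` and the example URL are assumptions for illustration; the two `replace` calls and the updated keys follow the hunk.

```python
def http_format_from_hls(m3u8_f):
    """Derive an HTTP format dict from an HLS one, or return None.

    Mirrors the rewrite in the diff: '/hls/' becomes '/mp4/' and the
    '/chunklist.m3u8' suffix becomes '.mp4'. If neither substitution
    changes the URL, no HTTP variant is derived.
    """
    http_url = m3u8_f['url'].replace('/hls/', '/mp4/').replace('/chunklist.m3u8', '.mp4')
    if http_url == m3u8_f['url']:
        return None
    f = dict(m3u8_f)  # shallow copy, the HLS entry is left untouched
    f.update({
        'format_id': f['format_id'].replace('hls', 'http'),
        'protocol': 'http',
        'url': http_url,
    })
    return f


# Hypothetical HLS format entry; example.com is a placeholder host.
hls = {
    'url': 'https://example.com/hls/video_720p/chunklist.m3u8',
    'format_id': 'hls-720p',
    'protocol': 'm3u8_native',
    'height': 720,
}
print(http_format_from_hls(hls))
```

The extractor version appends to `formats` as a side effect instead of returning a value; returning the dict here only makes the transformation easier to inspect.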
@ -6,6 +6,7 @@ from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
merge_dicts,
|
||||
str_to_int,
|
||||
unified_strdate,
|
||||
url_or_none,
|
||||
@ -45,11 +46,14 @@ class RedTubeIE(InfoExtractor):
|
||||
if any(s in webpage for s in ['video-deleted-info', '>This video has been removed']):
|
||||
raise ExtractorError('Video %s has been removed' % video_id, expected=True)
|
||||
|
||||
title = self._html_search_regex(
|
||||
(r'<h(\d)[^>]+class="(?:video_title_text|videoTitle)[^"]*">(?P<title>(?:(?!\1).)+)</h\1>',
|
||||
r'(?:videoTitle|title)\s*:\s*(["\'])(?P<title>(?:(?!\1).)+)\1',),
|
||||
webpage, 'title', group='title',
|
||||
default=None) or self._og_search_title(webpage)
|
||||
info = self._search_json_ld(webpage, video_id, default={})
|
||||
|
||||
if not info.get('title'):
|
||||
info['title'] = self._html_search_regex(
|
||||
(r'<h(\d)[^>]+class="(?:video_title_text|videoTitle)[^"]*">(?P<title>(?:(?!\1).)+)</h\1>',
|
||||
r'(?:videoTitle|title)\s*:\s*(["\'])(?P<title>(?:(?!\1).)+)\1',),
|
||||
webpage, 'title', group='title',
|
||||
default=None) or self._og_search_title(webpage)
|
||||
|
||||
formats = []
|
||||
sources = self._parse_json(
|
||||
@ -88,28 +92,28 @@ class RedTubeIE(InfoExtractor):
|
||||
|
||||
thumbnail = self._og_search_thumbnail(webpage)
|
||||
upload_date = unified_strdate(self._search_regex(
|
||||
r'<span[^>]+>ADDED ([^<]+)<',
|
||||
webpage, 'upload date', fatal=False))
|
||||
r'<span[^>]+>(?:ADDED|Published on) ([^<]+)<',
|
||||
webpage, 'upload date', default=None))
|
||||
duration = int_or_none(self._og_search_property(
|
||||
'video:duration', webpage, default=None) or self._search_regex(
|
||||
r'videoDuration\s*:\s*(\d+)', webpage, 'duration', default=None))
|
||||
view_count = str_to_int(self._search_regex(
|
||||
(r'<div[^>]*>Views</div>\s*<div[^>]*>\s*([\d,.]+)',
|
||||
r'<span[^>]*>VIEWS</span>\s*</td>\s*<td>\s*([\d,.]+)'),
|
||||
webpage, 'view count', fatal=False))
|
||||
r'<span[^>]*>VIEWS</span>\s*</td>\s*<td>\s*([\d,.]+)',
|
||||
r'<span[^>]+\bclass=["\']video_view_count[^>]*>\s*([\d,.]+)'),
|
||||
webpage, 'view count', default=None))
|
||||
|
||||
# No self-labeling, but they describe themselves as
|
||||
# "Home of Videos Porno"
|
||||
age_limit = 18
|
||||
|
||||
return {
|
||||
return merge_dicts(info, {
|
||||
'id': video_id,
|
||||
'ext': 'mp4',
|
||||
'title': title,
|
||||
'thumbnail': thumbnail,
|
||||
'upload_date': upload_date,
|
||||
'duration': duration,
|
||||
'view_count': view_count,
|
||||
'age_limit': age_limit,
|
||||
'formats': formats,
|
||||
}
|
||||
})
|
||||
|
@ -1,170 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_iso8601,
|
||||
unescapeHTML,
|
||||
qualities,
|
||||
)
|
||||
|
||||
|
||||
class Revision3EmbedIE(InfoExtractor):
|
||||
IE_NAME = 'revision3:embed'
|
||||
_VALID_URL = r'(?:revision3:(?:(?P<playlist_type>[^:]+):)?|https?://(?:(?:(?:www|embed)\.)?(?:revision3|animalist)|(?:(?:api|embed)\.)?seekernetwork)\.com/player/embed\?videoId=)(?P<playlist_id>\d+)'
|
||||
_TEST = {
|
||||
'url': 'http://api.seekernetwork.com/player/embed?videoId=67558',
|
||||
'md5': '83bcd157cab89ad7318dd7b8c9cf1306',
|
||||
'info_dict': {
|
||||
'id': '67558',
|
||||
'ext': 'mp4',
|
||||
'title': 'The Pros & Cons Of Zoos',
|
||||
'description': 'Zoos are often depicted as a terrible place for animals to live, but is there any truth to this?',
|
||||
'uploader_id': 'dnews',
|
||||
'uploader': 'DNews',
|
||||
}
|
||||
}
|
||||
_API_KEY = 'ba9c741bce1b9d8e3defcc22193f3651b8867e62'
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
playlist_id = mobj.group('playlist_id')
|
||||
playlist_type = mobj.group('playlist_type') or 'video_id'
|
||||
video_data = self._download_json(
|
||||
'http://revision3.com/api/getPlaylist.json', playlist_id, query={
|
||||
'api_key': self._API_KEY,
|
||||
'codecs': 'h264,vp8,theora',
|
||||
playlist_type: playlist_id,
|
||||
})['items'][0]
|
||||
|
||||
formats = []
|
||||
for vcodec, media in video_data['media'].items():
|
||||
for quality_id, quality in media.items():
|
||||
if quality_id == 'hls':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
quality['url'], playlist_id, 'mp4',
|
||||
'm3u8_native', m3u8_id='hls', fatal=False))
|
||||
else:
|
||||
formats.append({
|
||||
'url': quality['url'],
|
||||
'format_id': '%s-%s' % (vcodec, quality_id),
|
||||
'tbr': int_or_none(quality.get('bitrate')),
|
||||
'vcodec': vcodec,
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': playlist_id,
|
||||
'title': unescapeHTML(video_data['title']),
|
||||
'description': unescapeHTML(video_data.get('summary')),
|
||||
'uploader': video_data.get('show', {}).get('name'),
|
||||
'uploader_id': video_data.get('show', {}).get('slug'),
|
||||
'duration': int_or_none(video_data.get('duration')),
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
|
||||
class Revision3IE(InfoExtractor):
|
||||
IE_NAME = 'revision'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:revision3|animalist)\.com)/(?P<id>[^/]+(?:/[^/?#]+)?)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.revision3.com/technobuffalo/5-google-predictions-for-2016',
|
||||
'md5': 'd94a72d85d0a829766de4deb8daaf7df',
|
||||
'info_dict': {
|
||||
'id': '71089',
|
||||
'display_id': 'technobuffalo/5-google-predictions-for-2016',
|
||||
'ext': 'webm',
|
||||
'title': '5 Google Predictions for 2016',
|
||||
'description': 'Google had a great 2015, but it\'s already time to look ahead. Here are our five predictions for 2016.',
|
||||
'upload_date': '20151228',
|
||||
'timestamp': 1451325600,
|
||||
'duration': 187,
|
||||
'uploader': 'TechnoBuffalo',
|
||||
'uploader_id': 'technobuffalo',
|
||||
}
|
||||
}, {
|
||||
# Show
|
||||
'url': 'http://revision3.com/variant',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# Tag
|
||||
'url': 'http://revision3.com/vr',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_PAGE_DATA_TEMPLATE = 'http://www.%s/apiProxy/ddn/%s?domain=%s'
|
||||
|
||||
def _real_extract(self, url):
|
||||
domain, display_id = re.match(self._VALID_URL, url).groups()
|
||||
site = domain.split('.')[0]
|
||||
page_info = self._download_json(
|
||||
self._PAGE_DATA_TEMPLATE % (domain, display_id, domain), display_id)
|
||||
|
||||
page_data = page_info['data']
|
||||
page_type = page_data['type']
|
||||
if page_type in ('episode', 'embed'):
|
||||
show_data = page_data['show']['data']
|
||||
page_id = compat_str(page_data['id'])
|
||||
video_id = compat_str(page_data['video']['data']['id'])
|
||||
|
||||
preference = qualities(['mini', 'small', 'medium', 'large'])
|
||||
thumbnails = [{
|
||||
'url': image_url,
|
||||
'id': image_id,
|
||||
'preference': preference(image_id)
|
||||
} for image_id, image_url in page_data.get('images', {}).items()]
|
||||
|
||||
info = {
|
||||
'id': page_id,
|
||||
'display_id': display_id,
|
||||
'title': unescapeHTML(page_data['name']),
|
||||
'description': unescapeHTML(page_data.get('summary')),
|
||||
'timestamp': parse_iso8601(page_data.get('publishTime'), ' '),
|
||||
'author': page_data.get('author'),
|
||||
'uploader': show_data.get('name'),
|
||||
'uploader_id': show_data.get('slug'),
|
||||
'thumbnails': thumbnails,
|
||||
'extractor_key': site,
|
||||
}
|
||||
|
||||
if page_type == 'embed':
|
||||
info.update({
|
||||
'_type': 'url_transparent',
|
||||
'url': page_data['video']['data']['embed'],
|
||||
})
|
||||
return info
|
||||
|
||||
info.update({
|
||||
'_type': 'url_transparent',
|
||||
'url': 'revision3:%s' % video_id,
|
||||
})
|
||||
return info
|
||||
else:
|
||||
list_data = page_info[page_type]['data']
|
||||
episodes_data = page_info['episodes']['data']
|
||||
num_episodes = page_info['meta']['totalEpisodes']
|
||||
processed_episodes = 0
|
||||
entries = []
|
||||
page_num = 1
|
||||
while True:
|
||||
entries.extend([{
|
||||
'_type': 'url',
|
||||
'url': 'http://%s%s' % (domain, episode['path']),
|
||||
'id': compat_str(episode['id']),
|
||||
'ie_key': 'Revision3',
|
||||
'extractor_key': site,
|
||||
} for episode in episodes_data])
|
||||
processed_episodes += len(episodes_data)
|
||||
if processed_episodes == num_episodes:
|
||||
break
|
||||
page_num += 1
|
||||
episodes_data = self._download_json(self._PAGE_DATA_TEMPLATE % (
|
||||
domain, display_id + '/' + compat_str(page_num), domain),
|
||||
display_id)['episodes']['data']
|
||||
|
||||
return self.playlist_result(
|
||||
entries, compat_str(list_data['id']),
|
||||
list_data.get('name'), list_data.get('summary'))
|
@ -1,8 +1,6 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_HTTPError,
|
||||
@ -18,7 +16,6 @@ from ..utils import (
|
||||
|
||||
class RoosterTeethIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:.+?\.)?roosterteeth\.com/(?:episode|watch)/(?P<id>[^/?#&]+)'
|
||||
_LOGIN_URL = 'https://roosterteeth.com/login'
|
||||
_NETRC_MACHINE = 'roosterteeth'
|
||||
_TESTS = [{
|
||||
'url': 'http://roosterteeth.com/episode/million-dollars-but-season-2-million-dollars-but-the-game-announcement',
|
||||
@ -53,48 +50,40 @@ class RoosterTeethIE(InfoExtractor):
|
||||
'url': 'https://roosterteeth.com/watch/million-dollars-but-season-2-million-dollars-but-the-game-announcement',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_EPISODE_BASE_URL = 'https://svod-be.roosterteeth.com/api/v1/episodes/'
|
||||
|
||||
def _login(self):
|
||||
username, password = self._get_login_info()
|
||||
if username is None:
|
||||
return
|
||||
|
||||
login_page = self._download_webpage(
|
||||
self._LOGIN_URL, None,
|
||||
note='Downloading login page',
|
||||
errnote='Unable to download login page')
|
||||
|
||||
login_form = self._hidden_inputs(login_page)
|
||||
|
||||
login_form.update({
|
||||
'username': username,
|
||||
'password': password,
|
||||
})
|
||||
|
||||
login_request = self._download_webpage(
|
||||
self._LOGIN_URL, None,
|
||||
note='Logging in',
|
||||
data=urlencode_postdata(login_form),
|
||||
headers={
|
||||
'Referer': self._LOGIN_URL,
|
||||
})
|
||||
|
||||
if not any(re.search(p, login_request) for p in (
|
||||
r'href=["\']https?://(?:www\.)?roosterteeth\.com/logout"',
|
||||
r'>Sign Out<')):
|
||||
error = self._html_search_regex(
|
||||
r'(?s)<div[^>]+class=(["\']).*?\balert-danger\b.*?\1[^>]*>(?:\s*<button[^>]*>.*?</button>)?(?P<error>.+?)</div>',
|
||||
login_request, 'alert', default=None, group='error')
|
||||
if error:
|
||||
raise ExtractorError('Unable to login: %s' % error, expected=True)
|
||||
raise ExtractorError('Unable to log in')
|
||||
try:
|
||||
self._download_json(
|
||||
'https://auth.roosterteeth.com/oauth/token',
|
||||
None, 'Logging in', data=urlencode_postdata({
|
||||
'client_id': '4338d2b4bdc8db1239360f28e72f0d9ddb1fd01e7a38fbb07b4b1f4ba4564cc5',
|
||||
'grant_type': 'password',
|
||||
'username': username,
|
||||
'password': password,
|
||||
}))
|
||||
except ExtractorError as e:
|
||||
msg = 'Unable to login'
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
|
||||
resp = self._parse_json(e.cause.read().decode(), None, fatal=False)
|
||||
if resp:
|
||||
error = resp.get('extra_info') or resp.get('error_description') or resp.get('error')
|
||||
if error:
|
||||
msg += ': ' + error
|
||||
self.report_warning(msg)
|
||||
|
||||
def _real_initialize(self):
|
||||
if self._get_cookies(self._EPISODE_BASE_URL).get('rt_access_token'):
|
||||
return
|
||||
self._login()
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
api_episode_url = 'https://svod-be.roosterteeth.com/api/v1/episodes/%s' % display_id
|
||||
api_episode_url = self._EPISODE_BASE_URL + display_id
|
||||
|
||||
try:
|
||||
m3u8_url = self._download_json(
|
||||
|
144
youtube_dl/extractor/scte.py
Normal file
@ -0,0 +1,144 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
decode_packed_codes,
|
||||
ExtractorError,
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class SCTEBaseIE(InfoExtractor):
|
||||
_LOGIN_URL = 'https://www.scte.org/SCTE/Sign_In.aspx'
|
||||
_NETRC_MACHINE = 'scte'
|
||||
|
||||
def _real_initialize(self):
|
||||
self._login()
|
||||
|
||||
def _login(self):
|
||||
username, password = self._get_login_info()
|
||||
if username is None:
|
||||
return
|
||||
|
||||
login_popup = self._download_webpage(
|
||||
self._LOGIN_URL, None, 'Downloading login popup')
|
||||
|
||||
def is_logged(webpage):
|
||||
return any(re.search(p, webpage) for p in (
|
||||
r'class=["\']welcome\b', r'>Sign Out<'))
|
||||
|
||||
# already logged in
|
||||
if is_logged(login_popup):
|
||||
return
|
||||
|
||||
login_form = self._hidden_inputs(login_popup)
|
||||
|
||||
login_form.update({
|
||||
'ctl01$TemplateBody$WebPartManager1$gwpciNewContactSignInCommon$ciNewContactSignInCommon$signInUserName': username,
|
||||
'ctl01$TemplateBody$WebPartManager1$gwpciNewContactSignInCommon$ciNewContactSignInCommon$signInPassword': password,
|
||||
'ctl01$TemplateBody$WebPartManager1$gwpciNewContactSignInCommon$ciNewContactSignInCommon$RememberMe': 'on',
|
||||
})
|
||||
|
||||
response = self._download_webpage(
|
||||
self._LOGIN_URL, None, 'Logging in',
|
||||
data=urlencode_postdata(login_form))
|
||||
|
||||
if '|pageRedirect|' not in response and not is_logged(response):
|
||||
error = self._html_search_regex(
|
||||
r'(?s)<[^>]+class=["\']AsiError["\'][^>]*>(.+?)</',
|
||||
response, 'error message', default=None)
|
||||
if error:
|
||||
raise ExtractorError('Unable to login: %s' % error, expected=True)
|
||||
raise ExtractorError('Unable to log in')
|
||||
|
||||
|
||||
class SCTEIE(SCTEBaseIE):
|
||||
_VALID_URL = r'https?://learning\.scte\.org/mod/scorm/view\.php?.*?\bid=(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://learning.scte.org/mod/scorm/view.php?id=31484',
|
||||
'info_dict': {
|
||||
'title': 'Introduction to DOCSIS Engineering Professional',
|
||||
'id': '31484',
|
||||
},
|
||||
'playlist_count': 5,
|
||||
'skip': 'Requires account credentials',
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
title = self._search_regex(r'<h1>(.+?)</h1>', webpage, 'title')
|
||||
|
||||
context_id = self._search_regex(r'context-(\d+)', webpage, video_id)
|
||||
content_base = 'https://learning.scte.org/pluginfile.php/%s/mod_scorm/content/8/' % context_id
|
||||
context = decode_packed_codes(self._download_webpage(
|
||||
'%smobile/data.js' % content_base, video_id))
|
||||
|
||||
data = self._parse_xml(
|
||||
self._search_regex(
|
||||
r'CreateData\(\s*"(.+?)"', context, 'data').replace(r"\'", "'"),
|
||||
video_id)
|
||||
|
||||
entries = []
|
||||
for asset in data.findall('.//asset'):
|
||||
asset_url = asset.get('url')
|
||||
if not asset_url or not asset_url.endswith('.mp4'):
|
||||
continue
|
||||
asset_id = self._search_regex(
|
||||
r'video_([^_]+)_', asset_url, 'asset id', default=None)
|
||||
if not asset_id:
|
||||
continue
|
||||
entries.append({
|
||||
'id': asset_id,
|
||||
'title': title,
|
||||
'url': content_base + asset_url,
|
||||
})
|
||||
|
||||
return self.playlist_result(entries, video_id, title)
|
||||
|
||||
|
||||
class SCTECourseIE(SCTEBaseIE):
|
||||
_VALID_URL = r'https?://learning\.scte\.org/(?:mod/sub)?course/view\.php?.*?\bid=(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://learning.scte.org/mod/subcourse/view.php?id=31491',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://learning.scte.org/course/view.php?id=3639',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://learning.scte.org/course/view.php?id=3073',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
course_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, course_id)
|
||||
|
||||
title = self._search_regex(
|
||||
r'<h1>(.+?)</h1>', webpage, 'title', default=None)
|
||||
|
||||
entries = []
|
||||
for mobj in re.finditer(
|
||||
r'''(?x)
|
||||
<a[^>]+
|
||||
href=(["\'])
|
||||
(?P<url>
|
||||
https?://learning\.scte\.org/mod/
|
||||
(?P<kind>scorm|subcourse)/view\.php?(?:(?!\1).)*?
|
||||
\bid=\d+
|
||||
)
|
||||
''',
|
||||
webpage):
|
||||
item_url = mobj.group('url')
|
||||
if item_url == url:
|
||||
continue
|
||||
ie = (SCTEIE.ie_key() if mobj.group('kind') == 'scorm'
|
||||
else SCTECourseIE.ie_key())
|
||||
entries.append(self.url_result(item_url, ie=ie))
|
||||
|
||||
return self.playlist_result(entries, course_id, title)
|
@ -4,34 +4,37 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
get_element_by_class,
|
||||
strip_or_none,
|
||||
)
|
||||
|
||||
|
||||
class SeekerIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?seeker\.com/(?P<display_id>.*)-(?P<article_id>\d+)\.html'
|
||||
_TESTS = [{
|
||||
# player.loadRevision3Item
|
||||
'url': 'http://www.seeker.com/should-trump-be-required-to-release-his-tax-returns-1833805621.html',
|
||||
'md5': '30c1dc4030cc715cf05b423d0947ac18',
|
||||
'md5': '897d44bbe0d8986a2ead96de565a92db',
|
||||
'info_dict': {
|
||||
'id': '76243',
|
||||
'ext': 'webm',
|
||||
'id': 'Elrn3gnY',
|
||||
'ext': 'mp4',
|
||||
'title': 'Should Trump Be Required To Release His Tax Returns?',
|
||||
'description': 'Donald Trump has been secretive about his "big," "beautiful" tax returns. So what can we learn if he decides to release them?',
|
||||
'uploader': 'Seeker Daily',
|
||||
'uploader_id': 'seekerdaily',
|
||||
'description': 'md5:41efa8cfa8d627841045eec7b018eb45',
|
||||
'timestamp': 1490090165,
|
||||
'upload_date': '20170321',
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.seeker.com/changes-expected-at-zoos-following-recent-gorilla-lion-shootings-1834116536.html',
|
||||
'playlist': [
|
||||
{
|
||||
'md5': '83bcd157cab89ad7318dd7b8c9cf1306',
|
||||
'md5': '0497b9f20495174be73ae136949707d2',
|
||||
'info_dict': {
|
||||
'id': '67558',
|
||||
'id': 'FihYQ8AE',
|
||||
'ext': 'mp4',
|
||||
'title': 'The Pros & Cons Of Zoos',
|
||||
'description': 'Zoos are often depicted as a terrible place for animals to live, but is there any truth to this?',
|
||||
'uploader': 'DNews',
|
||||
'uploader_id': 'dnews',
|
||||
'description': 'md5:d88f99a8ea8e7d25e6ff77f271b1271c',
|
||||
'timestamp': 1490039133,
|
||||
'upload_date': '20170320',
|
||||
},
|
||||
}
|
||||
],
|
||||
@ -45,13 +48,11 @@ class SeekerIE(InfoExtractor):
|
||||
def _real_extract(self, url):
|
||||
display_id, article_id = re.match(self._VALID_URL, url).groups()
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
mobj = re.search(r"player\.loadRevision3Item\('([^']+)'\s*,\s*(\d+)\);", webpage)
|
||||
if mobj:
|
||||
playlist_type, playlist_id = mobj.groups()
|
||||
return self.url_result(
|
||||
'revision3:%s:%s' % (playlist_type, playlist_id), 'Revision3Embed', playlist_id)
|
||||
else:
|
||||
entries = [self.url_result('revision3:video_id:%s' % video_id, 'Revision3Embed', video_id) for video_id in re.findall(
|
||||
r'<iframe[^>]+src=[\'"](?:https?:)?//api\.seekernetwork\.com/player/embed\?videoId=(\d+)', webpage)]
|
||||
return self.playlist_result(
|
||||
entries, article_id, self._og_search_title(webpage), self._og_search_description(webpage))
|
||||
entries = []
|
||||
for jwp_id in re.findall(r'data-video-id="([a-zA-Z0-9]{8})"', webpage):
|
||||
entries.append(self.url_result(
|
||||
'jwplatform:' + jwp_id, 'JWPlatform', jwp_id))
|
||||
return self.playlist_result(
|
||||
entries, article_id,
|
||||
self._og_search_title(webpage),
|
||||
strip_or_none(get_element_by_class('subtitle__text', webpage)) or self._og_search_description(webpage))
|
||||
|
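The Seeker change above swaps the Revision3 embed lookup for a scan of `data-video-id` attributes, each handed off as a `jwplatform:` URL. A minimal standalone sketch of that step follows; the helper name `jwplatform_urls` and the sample markup are assumptions for illustration and are not part of youtube-dl.

```python
import re

# Same pattern as in the diff: exactly eight alphanumeric characters.
JWP_ID_RE = r'data-video-id="([a-zA-Z0-9]{8})"'


def jwplatform_urls(webpage):
    """Return one 'jwplatform:<id>' URL per matching attribute, in page order."""
    return ['jwplatform:' + jwp_id for jwp_id in re.findall(JWP_ID_RE, webpage)]


# Hypothetical markup, only for illustration.
sample = (
    '<div data-video-id="Elrn3gnY"></div>'
    '<div data-video-id="FihYQ8AE"></div>'
    '<div data-video-id="short"></div>'  # wrong length, so it is ignored
)
print(jwplatform_urls(sample))
```

In the extractor itself each URL is wrapped with `self.url_result(..., 'JWPlatform', jwp_id)`, so the JW Platform extractor does the actual media lookup.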
@ -1,72 +0,0 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
)
|
||||
|
||||
|
||||
class ServingSysIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:[^.]+\.)?serving-sys\.com/BurstingPipe/adServer\.bs\?.*?&pli=(?P<id>[0-9]+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://bs.serving-sys.com/BurstingPipe/adServer.bs?cn=is&c=23&pl=VAST&pli=5349193&PluID=0&pos=7135&ord=[timestamp]&cim=1?',
|
||||
'info_dict': {
|
||||
'id': '5349193',
|
||||
'title': 'AdAPPter_Hyundai_demo',
|
||||
},
|
||||
'playlist': [{
|
||||
'md5': 'baed851342df6846eb8677a60a011a0f',
|
||||
'info_dict': {
|
||||
'id': '29955898',
|
||||
'ext': 'flv',
|
||||
'title': 'AdAPPter_Hyundai_demo (1)',
|
||||
'duration': 74,
|
||||
'tbr': 1378,
|
||||
'width': 640,
|
||||
'height': 400,
|
||||
},
|
||||
}, {
|
||||
'md5': '979b4da2655c4bc2d81aeb915a8c5014',
|
||||
'info_dict': {
|
||||
'id': '29907998',
|
||||
'ext': 'flv',
|
||||
'title': 'AdAPPter_Hyundai_demo (2)',
|
||||
'duration': 34,
|
||||
'width': 854,
|
||||
'height': 480,
|
||||
'tbr': 516,
|
||||
},
|
||||
}],
|
||||
'params': {
|
||||
'playlistend': 2,
|
||||
},
|
||||
'_skip': 'Blocked in the US [sic]',
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
pl_id = self._match_id(url)
|
||||
vast_doc = self._download_xml(url, pl_id)
|
||||
|
||||
title = vast_doc.find('.//AdTitle').text
|
||||
media = vast_doc.find('.//MediaFile').text
|
||||
info_url = self._search_regex(r'&adData=([^&]+)&', media, 'info URL')
|
||||
|
||||
doc = self._download_xml(info_url, pl_id, 'Downloading video info')
|
||||
entries = [{
|
||||
'_type': 'video',
|
||||
'id': a.attrib['id'],
|
||||
'title': '%s (%s)' % (title, a.attrib['assetID']),
|
||||
'url': a.attrib['URL'],
|
||||
'duration': int_or_none(a.attrib.get('length')),
|
||||
'tbr': int_or_none(a.attrib.get('bitrate')),
|
||||
'height': int_or_none(a.attrib.get('height')),
|
||||
'width': int_or_none(a.attrib.get('width')),
|
||||
} for a in doc.findall('.//AdditionalAssets/asset')]
|
||||
|
||||
return {
|
||||
'_type': 'playlist',
|
||||
'id': pl_id,
|
||||
'title': title,
|
||||
'entries': entries,
|
||||
}
|
@ -11,14 +11,13 @@ from .common import (
|
||||
from ..compat import (
|
||||
compat_str,
|
||||
compat_urlparse,
|
||||
compat_urllib_parse_urlencode,
|
||||
)
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
float_or_none,
|
||||
HEADRequest,
|
||||
int_or_none,
|
||||
KNOWN_EXTENSIONS,
|
||||
merge_dicts,
|
||||
mimetype2ext,
|
||||
str_or_none,
|
||||
try_get,
|
||||
@ -28,6 +27,20 @@ from ..utils import (
|
||||
)
|
||||
|
||||
|
||||
class SoundcloudEmbedIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:w|player|p)\.soundcloud\.com/player/?.*?url=(?P<id>.*)'
|
||||
|
||||
@staticmethod
|
||||
def _extract_urls(webpage):
|
||||
return [m.group('url') for m in re.finditer(
|
||||
r'<iframe[^>]+src=(["\'])(?P<url>(?:https?://)?(?:w\.)?soundcloud\.com/player.+?)\1',
|
||||
webpage)]
|
||||
|
||||
def _real_extract(self, url):
|
||||
return self.url_result(compat_urlparse.parse_qs(
|
||||
compat_urlparse.urlparse(url).query)['url'][0])
|
||||
|
||||
|
||||
class SoundcloudIE(InfoExtractor):
|
||||
"""Information extractor for soundcloud.com
|
||||
To access the media, the uid of the song and a stream token
|
||||
@ -44,9 +57,8 @@ class SoundcloudIE(InfoExtractor):
|
||||
(?!(?:tracks|albums|sets(?:/.+?)?|reposts|likes|spotlight)/?(?:$|[?#]))
|
||||
(?P<title>[\w\d-]+)/?
|
||||
(?P<token>[^?]+?)?(?:[?].*)?$)
|
||||
|(?:api\.soundcloud\.com/tracks/(?P<track_id>\d+)
|
||||
|(?:api(?:-v2)?\.soundcloud\.com/tracks/(?P<track_id>\d+)
|
||||
(?:/?\?secret_token=(?P<secret_token>[^&]+))?)
|
||||
|(?P<player>(?:w|player|p.)\.soundcloud\.com/player/?.*?url=.*)
|
||||
)
|
||||
'''
|
||||
IE_NAME = 'soundcloud'
|
||||
@ -60,6 +72,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'title': 'Lostin Powers - She so Heavy (SneakPreview) Adrian Ackers Blueprint 1',
|
||||
'description': 'No Downloads untill we record the finished version this weekend, i was too pumped n i had to post it , earl is prolly gonna b hella p.o\'d',
|
||||
'uploader': 'E.T. ExTerrestrial Music',
|
||||
'uploader_id': '1571244',
|
||||
'timestamp': 1349920598,
|
||||
'upload_date': '20121011',
|
||||
'duration': 143.216,
|
||||
@ -79,6 +92,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'title': 'Goldrushed',
|
||||
'description': 'From Stockholm Sweden\r\nPovel / Magnus / Filip / David\r\nwww.theroyalconcept.com',
|
||||
'uploader': 'The Royal Concept',
|
||||
'uploader_id': '9615865',
|
||||
'timestamp': 1337635207,
|
||||
'upload_date': '20120521',
|
||||
'duration': 30,
|
||||
@ -92,6 +106,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
# rtmp
|
||||
'skip_download': True,
|
||||
},
|
||||
'skip': 'Preview',
|
||||
},
|
||||
# private link
|
||||
{
|
||||
@ -103,6 +118,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'title': 'Youtube - Dl Test Video \'\' Ä↭',
|
||||
'description': 'test chars: \"\'/\\ä↭',
|
||||
'uploader': 'jaimeMF',
|
||||
'uploader_id': '69767071',
|
||||
'timestamp': 1386604920,
|
||||
'upload_date': '20131209',
|
||||
'duration': 9.927,
|
||||
@ -123,6 +139,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'title': 'Youtube - Dl Test Video \'\' Ä↭',
|
||||
'description': 'test chars: \"\'/\\ä↭',
|
||||
'uploader': 'jaimeMF',
|
||||
'uploader_id': '69767071',
|
||||
'timestamp': 1386604920,
|
||||
'upload_date': '20131209',
|
||||
'duration': 9.927,
|
||||
@ -143,6 +160,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'title': 'Bus Brakes',
|
||||
'description': 'md5:0053ca6396e8d2fd7b7e1595ef12ab66',
|
||||
'uploader': 'oddsamples',
|
||||
'uploader_id': '73680509',
|
||||
'timestamp': 1389232924,
|
||||
'upload_date': '20140109',
|
||||
'duration': 17.346,
|
||||
@ -163,6 +181,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'title': 'Uplifting Only 238 [No Talking] (incl. Alex Feed Guestmix) (Aug 31, 2017) [wav]',
|
||||
'description': 'md5:fa20ee0fca76a3d6df8c7e57f3715366',
|
||||
'uploader': 'Ori Uplift Music',
|
||||
'uploader_id': '12563093',
|
||||
'timestamp': 1504206263,
|
||||
'upload_date': '20170831',
|
||||
'duration': 7449.096,
|
||||
@ -183,6 +202,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'title': 'Sideways (Prod. Mad Real)',
|
||||
'description': 'md5:d41d8cd98f00b204e9800998ecf8427e',
|
||||
'uploader': 'garyvee',
|
||||
'uploader_id': '2366352',
|
||||
'timestamp': 1488152409,
|
||||
'upload_date': '20170226',
|
||||
'duration': 207.012,
|
||||
@ -207,6 +227,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'title': 'Mezzo Valzer',
|
||||
'description': 'md5:4138d582f81866a530317bae316e8b61',
|
||||
'uploader': 'Giovanni Sarani',
|
||||
'uploader_id': '3352531',
|
||||
'timestamp': 1551394171,
|
||||
'upload_date': '20190228',
|
||||
'duration': 180.157,
|
||||
@ -221,114 +242,81 @@ class SoundcloudIE(InfoExtractor):
|
||||
}
|
||||
]
|
||||
|
||||
_API_BASE = 'https://api.soundcloud.com/'
|
||||
_API_V2_BASE = 'https://api-v2.soundcloud.com/'
|
||||
_BASE_URL = 'https://soundcloud.com/'
|
||||
_CLIENT_ID = 'BeGVhOrGmfboy1LtiHTQF6Ejpt9ULJCI'
|
||||
_IMAGE_REPL_RE = r'-([0-9a-z]+)\.jpg'
|
||||
|
||||
@staticmethod
|
||||
def _extract_urls(webpage):
|
||||
return [m.group('url') for m in re.finditer(
|
||||
r'<iframe[^>]+src=(["\'])(?P<url>(?:https?://)?(?:w\.)?soundcloud\.com/player.+?)\1',
|
||||
webpage)]
|
||||
_ARTWORK_MAP = {
|
||||
'mini': 16,
|
||||
'tiny': 20,
|
||||
'small': 32,
|
||||
'badge': 47,
|
||||
't67x67': 67,
|
||||
'large': 100,
|
||||
't300x300': 300,
|
||||
'crop': 400,
|
||||
't500x500': 500,
|
||||
'original': 0,
|
||||
}
|
||||
|
||||
@classmethod
|
||||
def _resolv_url(cls, url):
|
||||
return 'https://api.soundcloud.com/resolve.json?url=' + url + '&client_id=' + cls._CLIENT_ID
|
||||
return SoundcloudIE._API_V2_BASE + 'resolve?url=' + url + '&client_id=' + cls._CLIENT_ID
|
||||
|
||||
def _extract_info_dict(self, info, full_title=None, quiet=False, secret_token=None):
|
||||
def _extract_info_dict(self, info, full_title=None, secret_token=None, version=2):
|
||||
track_id = compat_str(info['id'])
|
||||
title = info['title']
|
||||
name = full_title or track_id
|
||||
if quiet:
|
||||
self.report_extraction(name)
|
||||
thumbnail = info.get('artwork_url') or info.get('user', {}).get('avatar_url')
|
||||
if isinstance(thumbnail, compat_str):
|
||||
thumbnail = thumbnail.replace('-large', '-t500x500')
|
||||
username = try_get(info, lambda x: x['user']['username'], compat_str)
|
||||
|
||||
def extract_count(key):
|
||||
return int_or_none(info.get('%s_count' % key))
|
||||
|
||||
like_count = extract_count('favoritings')
|
||||
if like_count is None:
|
||||
like_count = extract_count('likes')
|
||||
|
||||
result = {
|
||||
'id': track_id,
|
||||
'uploader': username,
|
||||
'timestamp': unified_timestamp(info.get('created_at')),
|
||||
'title': title,
|
||||
'description': info.get('description'),
|
||||
'thumbnail': thumbnail,
|
||||
'duration': float_or_none(info.get('duration'), 1000),
|
||||
'webpage_url': info.get('permalink_url'),
|
||||
'license': info.get('license'),
|
||||
'view_count': extract_count('playback'),
|
||||
'like_count': like_count,
|
||||
'comment_count': extract_count('comment'),
|
||||
'repost_count': extract_count('reposts'),
|
||||
'genre': info.get('genre'),
|
||||
}
|
||||
track_base_url = self._API_BASE + 'tracks/%s' % track_id
|
||||
|
||||
format_urls = set()
|
||||
formats = []
|
||||
query = {'client_id': self._CLIENT_ID}
|
||||
if secret_token is not None:
|
||||
if secret_token:
|
||||
query['secret_token'] = secret_token
|
||||
if info.get('downloadable', False):
|
||||
# We can build a direct link to the song
|
||||
|
||||
if info.get('downloadable') and info.get('has_downloads_left'):
|
||||
format_url = update_url_query(
|
||||
'https://api.soundcloud.com/tracks/%s/download' % track_id, query)
|
||||
info.get('download_url') or track_base_url + '/download', query)
|
||||
format_urls.add(format_url)
|
||||
if version == 2:
|
||||
v1_info = self._download_json(
|
||||
track_base_url, track_id, query=query, fatal=False) or {}
|
||||
else:
|
||||
v1_info = info
|
||||
formats.append({
|
||||
'format_id': 'download',
|
||||
'ext': info.get('original_format', 'mp3'),
|
||||
'ext': v1_info.get('original_format') or 'mp3',
|
||||
'filesize': int_or_none(v1_info.get('original_content_size')),
|
||||
'url': format_url,
|
||||
'vcodec': 'none',
|
||||
'preference': 10,
|
||||
})
|
||||
|
||||
# Old API, does not work for some tracks (e.g.
|
||||
# https://soundcloud.com/giovannisarani/mezzo-valzer)
|
||||
format_dict = self._download_json(
|
||||
'https://api.soundcloud.com/i1/tracks/%s/streams' % track_id,
|
||||
track_id, 'Downloading track url', query=query, fatal=False)
|
||||
def invalid_url(url):
|
||||
return not url or url in format_urls or re.search(r'/(?:preview|playlist)/0/30/', url)
|
||||
|
||||
if format_dict:
|
||||
for key, stream_url in format_dict.items():
|
||||
if stream_url in format_urls:
|
||||
continue
|
||||
format_urls.add(stream_url)
|
||||
ext, abr = 'mp3', None
|
||||
mobj = re.search(r'_([^_]+)_(\d+)_url', key)
|
||||
if mobj:
|
||||
ext, abr = mobj.groups()
|
||||
abr = int(abr)
|
||||
if key.startswith('http'):
|
||||
stream_formats = [{
|
||||
'format_id': key,
|
||||
'ext': ext,
|
||||
'url': stream_url,
|
||||
}]
|
||||
elif key.startswith('rtmp'):
|
||||
# The url doesn't have an rtmp app, we have to extract the playpath
|
||||
url, path = stream_url.split('mp3:', 1)
|
||||
stream_formats = [{
|
||||
'format_id': key,
|
||||
'url': url,
|
||||
'play_path': 'mp3:' + path,
|
||||
'ext': 'flv',
|
||||
}]
|
||||
elif key.startswith('hls'):
|
||||
stream_formats = self._extract_m3u8_formats(
|
||||
stream_url, track_id, ext, entry_protocol='m3u8_native',
|
||||
m3u8_id=key, fatal=False)
|
||||
else:
|
||||
continue
|
||||
|
||||
if abr:
|
||||
for f in stream_formats:
|
||||
f['abr'] = abr
|
||||
|
||||
formats.extend(stream_formats)
|
||||
def add_format(f, protocol):
|
||||
mobj = re.search(r'\.(?P<abr>\d+)\.(?P<ext>[0-9a-z]{3,4})(?=[/?])', stream_url)
|
||||
if mobj:
|
||||
for k, v in mobj.groupdict().items():
|
||||
if not f.get(k):
|
||||
f[k] = v
|
||||
format_id_list = []
|
||||
if protocol:
|
||||
format_id_list.append(protocol)
|
||||
for k in ('ext', 'abr'):
|
||||
v = f.get(k)
|
||||
if v:
|
||||
format_id_list.append(v)
|
||||
abr = f.get('abr')
|
||||
if abr:
|
||||
f['abr'] = int(abr)
|
||||
f.update({
|
||||
'format_id': '_'.join(format_id_list),
|
||||
'protocol': 'm3u8_native' if protocol == 'hls' else 'http',
|
||||
})
|
||||
formats.append(f)
|
||||
|
||||
# New API
|
||||
transcodings = try_get(
|
||||
@ -337,129 +325,165 @@ class SoundcloudIE(InfoExtractor):
|
||||
if not isinstance(t, dict):
|
||||
continue
|
||||
format_url = url_or_none(t.get('url'))
|
||||
if not format_url:
|
||||
if not format_url or t.get('snipped') or '/preview/' in format_url:
|
||||
continue
|
||||
stream = self._download_json(
|
||||
update_url_query(format_url, query), track_id, fatal=False)
|
||||
format_url, track_id, query=query, fatal=False)
|
||||
if not isinstance(stream, dict):
|
||||
continue
|
||||
stream_url = url_or_none(stream.get('url'))
|
||||
if not stream_url:
|
||||
continue
|
||||
if stream_url in format_urls:
|
||||
if invalid_url(stream_url):
|
||||
continue
|
||||
format_urls.add(stream_url)
|
||||
protocol = try_get(t, lambda x: x['format']['protocol'], compat_str)
|
||||
stream_format = t.get('format') or {}
|
||||
protocol = stream_format.get('protocol')
|
||||
if protocol != 'hls' and '/hls' in format_url:
|
||||
protocol = 'hls'
|
||||
ext = None
|
||||
preset = str_or_none(t.get('preset'))
|
||||
if preset:
|
||||
ext = preset.split('_')[0]
|
||||
if ext not in KNOWN_EXTENSIONS:
|
||||
mimetype = try_get(
|
||||
t, lambda x: x['format']['mime_type'], compat_str)
|
||||
ext = mimetype2ext(mimetype) or 'mp3'
|
||||
format_id_list = []
|
||||
if protocol:
|
||||
format_id_list.append(protocol)
|
||||
format_id_list.append(ext)
|
||||
format_id = '_'.join(format_id_list)
|
||||
formats.append({
|
||||
if ext not in KNOWN_EXTENSIONS:
|
||||
ext = mimetype2ext(stream_format.get('mime_type'))
|
||||
add_format({
|
||||
'url': stream_url,
|
||||
'format_id': format_id,
|
||||
'ext': ext,
|
||||
'protocol': 'm3u8_native' if protocol == 'hls' else 'http',
|
||||
})
|
||||
}, 'http' if protocol == 'progressive' else protocol)
|
||||
|
||||
if not formats:
|
||||
# Old API, does not work for some tracks (e.g.
|
||||
# https://soundcloud.com/giovannisarani/mezzo-valzer)
|
||||
# and might serve preview URLs (e.g.
|
||||
# http://www.soundcloud.com/snbrn/ele)
|
||||
format_dict = self._download_json(
|
||||
track_base_url + '/streams', track_id,
|
||||
'Downloading track url', query=query, fatal=False) or {}
|
||||
|
||||
for key, stream_url in format_dict.items():
|
||||
if invalid_url(stream_url):
|
||||
continue
|
||||
format_urls.add(stream_url)
|
||||
mobj = re.search(r'(http|hls)_([^_]+)_(\d+)_url', key)
|
||||
if mobj:
|
||||
protocol, ext, abr = mobj.groups()
|
||||
add_format({
|
||||
'abr': abr,
|
||||
'ext': ext,
|
||||
'url': stream_url,
|
||||
}, protocol)
|
||||
|
||||
if not formats:
|
||||
# We fallback to the stream_url in the original info, this
|
||||
# cannot be always used, sometimes it can give an HTTP 404 error
|
||||
formats.append({
|
||||
'format_id': 'fallback',
|
||||
'url': update_url_query(info['stream_url'], query),
|
||||
'ext': 'mp3',
|
||||
})
|
||||
self._check_formats(formats, track_id)
|
||||
urlh = self._request_webpage(
|
||||
HEADRequest(info.get('stream_url') or track_base_url + '/stream'),
|
||||
track_id, query=query, fatal=False)
|
||||
if urlh:
|
||||
stream_url = urlh.geturl()
|
||||
if not invalid_url(stream_url):
|
||||
add_format({'url': stream_url}, 'http')
|
||||
|
||||
for f in formats:
|
||||
f['vcodec'] = 'none'
|
||||
|
||||
self._sort_formats(formats)
|
||||
result['formats'] = formats
|
||||
|
||||
return result
|
||||
user = info.get('user') or {}
|
||||
|
||||
thumbnails = []
|
||||
artwork_url = info.get('artwork_url')
|
||||
thumbnail = artwork_url or user.get('avatar_url')
|
||||
if isinstance(thumbnail, compat_str):
|
||||
if re.search(self._IMAGE_REPL_RE, thumbnail):
|
||||
for image_id, size in self._ARTWORK_MAP.items():
|
||||
i = {
|
||||
'id': image_id,
|
||||
'url': re.sub(self._IMAGE_REPL_RE, '-%s.jpg' % image_id, thumbnail),
|
||||
}
|
||||
if image_id == 'tiny' and not artwork_url:
|
||||
size = 18
|
||||
elif image_id == 'original':
|
||||
i['preference'] = 10
|
||||
if size:
|
||||
i.update({
|
||||
'width': size,
|
||||
'height': size,
|
||||
})
|
||||
thumbnails.append(i)
|
||||
else:
|
||||
thumbnails = [{'url': thumbnail}]
|
||||
|
||||
def extract_count(key):
|
||||
return int_or_none(info.get('%s_count' % key))
|
||||
|
||||
return {
|
||||
'id': track_id,
|
||||
'uploader': user.get('username'),
|
||||
'uploader_id': str_or_none(user.get('id')) or user.get('permalink'),
|
||||
'uploader_url': user.get('permalink_url'),
|
||||
'timestamp': unified_timestamp(info.get('created_at')),
|
||||
'title': title,
|
||||
'description': info.get('description'),
|
||||
'thumbnails': thumbnails,
|
||||
'duration': float_or_none(info.get('duration'), 1000),
|
||||
'webpage_url': info.get('permalink_url'),
|
||||
'license': info.get('license'),
|
||||
'view_count': extract_count('playback'),
|
||||
'like_count': extract_count('favoritings') or extract_count('likes'),
|
||||
'comment_count': extract_count('comment'),
|
||||
'repost_count': extract_count('reposts'),
|
||||
'genre': info.get('genre'),
|
||||
'formats': formats
|
||||
}
|
||||
|
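The thumbnail handling at the end of `_extract_info_dict` expands one artwork URL into a variant per size by rewriting the size suffix. The sketch below reproduces that logic outside the extractor under stated assumptions: `build_thumbnails` is a hypothetical function name, the `is_artwork` flag stands in for the `artwork_url` check, and the sample URL is made up.

```python
import re

IMAGE_REPL_RE = r'-([0-9a-z]+)\.jpg'
ARTWORK_MAP = {
    'mini': 16, 'tiny': 20, 'small': 32, 'badge': 47, 't67x67': 67,
    'large': 100, 't300x300': 300, 'crop': 400, 't500x500': 500,
    'original': 0,
}


def build_thumbnails(thumbnail, is_artwork=True):
    """Expand one image URL into a list of size variants."""
    if not re.search(IMAGE_REPL_RE, thumbnail):
        # No recognizable size suffix: keep the URL as the only thumbnail.
        return [{'url': thumbnail}]
    thumbnails = []
    for image_id, size in ARTWORK_MAP.items():
        entry = {
            'id': image_id,
            'url': re.sub(IMAGE_REPL_RE, '-%s.jpg' % image_id, thumbnail),
        }
        if image_id == 'tiny' and not is_artwork:
            size = 18  # avatars use a slightly smaller "tiny" size
        elif image_id == 'original':
            entry['preference'] = 10  # unknown dimensions, but preferred
        if size:
            entry.update({'width': size, 'height': size})
        thumbnails.append(entry)
    return thumbnails


# Hypothetical artwork URL, only for illustration.
for t in build_thumbnails('https://img.example.com/artworks-0001-abcdef-large.jpg'):
    print(t['id'], t['url'])
```

Because `original` maps to size 0, that entry carries no `width` or `height` and relies on `preference` alone to sort first.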
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url, flags=re.VERBOSE)
|
||||
if mobj is None:
|
||||
raise ExtractorError('Invalid URL: %s' % url)
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
|
||||
track_id = mobj.group('track_id')
|
||||
new_info = {}
|
||||
|
||||
if track_id is not None:
|
||||
info_json_url = 'https://api.soundcloud.com/tracks/' + track_id + '.json?client_id=' + self._CLIENT_ID
|
||||
query = {
|
||||
'client_id': self._CLIENT_ID,
|
||||
}
|
||||
if track_id:
|
||||
info_json_url = self._API_V2_BASE + 'tracks/' + track_id
|
||||
full_title = track_id
|
||||
token = mobj.group('secret_token')
|
||||
if token:
|
||||
info_json_url += '&secret_token=' + token
|
||||
elif mobj.group('player'):
|
||||
query = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
|
||||
real_url = query['url'][0]
|
||||
# If the token is in the query of the original url we have to
|
||||
# manually add it
|
||||
if 'secret_token' in query:
|
||||
real_url += '?secret_token=' + query['secret_token'][0]
|
||||
return self.url_result(real_url)
|
||||
query['secret_token'] = token
|
||||
else:
|
||||
# extract uploader (which is in the url)
|
||||
uploader = mobj.group('uploader')
|
||||
# extract simple title (uploader + slug of song title)
|
||||
slug_title = mobj.group('title')
|
||||
full_title = resolve_title = '%s/%s' % mobj.group('uploader', 'title')
|
||||
token = mobj.group('token')
|
||||
full_title = resolve_title = '%s/%s' % (uploader, slug_title)
|
||||
if token:
|
||||
resolve_title += '/%s' % token
|
||||
info_json_url = self._resolv_url(self._BASE_URL + resolve_title)
|
||||
|
||||
webpage = self._download_webpage(url, full_title, fatal=False)
|
||||
if webpage:
|
||||
entries = self._parse_json(
|
||||
self._search_regex(
|
||||
r'var\s+c\s*=\s*(\[.+?\])\s*,\s*o\s*=Date\b', webpage,
|
||||
'data', default='[]'), full_title, fatal=False)
|
||||
if entries:
|
||||
for e in entries:
|
||||
if not isinstance(e, dict):
|
||||
continue
|
||||
if e.get('id') != 67:
|
||||
continue
|
||||
data = try_get(e, lambda x: x['data'][0], dict)
|
||||
if data:
|
||||
new_info = data
|
||||
break
|
||||
info_json_url = self._resolv_url(
|
||||
'https://soundcloud.com/%s' % resolve_title)
|
||||
|
||||
# Contains some additional info missing from new_info
|
||||
version = 2
|
||||
info = self._download_json(
|
||||
info_json_url, full_title, 'Downloading info JSON')
|
||||
info_json_url, full_title, 'Downloading info JSON', query=query, fatal=False)
|
||||
if not info:
|
||||
info = self._download_json(
|
||||
info_json_url.replace(self._API_V2_BASE, self._API_BASE),
|
||||
full_title, 'Downloading info JSON', query=query)
|
||||
version = 1
|
||||
|
||||
return self._extract_info_dict(
|
||||
merge_dicts(info, new_info), full_title, secret_token=token)
|
||||
return self._extract_info_dict(info, full_title, token, version)
|
||||
|
||||
|
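The rewritten `_real_extract` asks the v2 API first and, if that yields nothing, retries the same path on the v1 host while recording which version answered. A rough sketch of that control flow is below. The `fetch_json` callable is a hypothetical stand-in for `self._download_json(..., fatal=False)`, and unlike the extractor (whose second request is fatal) this version simply returns whatever the retry produced.

```python
API_BASE = 'https://api.soundcloud.com/'
API_V2_BASE = 'https://api-v2.soundcloud.com/'


def fetch_info(info_url, fetch_json):
    """Try the v2 URL first; on failure retry the same path on the v1 host.

    fetch_json is a hypothetical callable returning a dict, or None on failure.
    Returns an (info, version) tuple.
    """
    info = fetch_json(info_url)
    if info:
        return info, 2
    # Swap only the host prefix; the path and the rest of the URL stay the same.
    return fetch_json(info_url.replace(API_V2_BASE, API_BASE)), 1


def fake_fetch(url):
    # Pretend that only the v1 host answers.
    return {'id': 1} if url.startswith(API_BASE) else None


print(fetch_info(API_V2_BASE + 'tracks/1', fake_fetch))
```

The version number matters downstream: `_extract_info_dict` only issues an extra v1 request for download metadata when the info came from v2.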
||||
class SoundcloudPlaylistBaseIE(SoundcloudIE):
|
||||
@staticmethod
|
||||
def _extract_id(e):
|
||||
return compat_str(e['id']) if e.get('id') else None
|
||||
|
||||
def _extract_track_entries(self, tracks):
|
||||
return [
|
||||
self.url_result(
|
||||
track['permalink_url'], SoundcloudIE.ie_key(),
|
||||
video_id=self._extract_id(track))
|
||||
for track in tracks if track.get('permalink_url')]
|
||||
def _extract_track_entries(self, tracks, token=None):
|
||||
entries = []
|
||||
for track in tracks:
|
||||
track_id = str_or_none(track.get('id'))
|
||||
url = track.get('permalink_url')
|
||||
if not url:
|
||||
if not track_id:
|
||||
continue
|
||||
url = self._API_V2_BASE + 'tracks/' + track_id
|
||||
if token:
|
||||
url += '?secret_token=' + token
|
||||
entries.append(self.url_result(
|
||||
url, SoundcloudIE.ie_key(), track_id))
|
||||
return entries
|
||||
|
||||
|
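The new `_extract_track_entries` no longer drops tracks that lack a `permalink_url`; it falls back to an API v2 track URL and appends the playlist's secret token when one is present. A minimal sketch of the URL selection, assuming a hypothetical helper `track_entry_urls` that returns plain `(url, track_id)` tuples instead of `url_result` dicts:

```python
API_V2_BASE = 'https://api-v2.soundcloud.com/'


def track_entry_urls(tracks, token=None):
    """Pick a URL for every track, preferring its permalink."""
    entries = []
    for track in tracks:
        raw_id = track.get('id')
        track_id = None if raw_id is None else str(raw_id)
        url = track.get('permalink_url')
        if not url:
            if not track_id:
                continue  # nothing to build a URL from
            url = API_V2_BASE + 'tracks/' + track_id
            if token:
                url += '?secret_token=' + token
        entries.append((url, track_id))
    return entries


# Hypothetical playlist payload, only for illustration.
tracks = [
    {'id': 1, 'permalink_url': 'https://soundcloud.com/a/b'},
    {'id': 2},
    {},
]
print(track_entry_urls(tracks, token='s-XYZ'))
```

The token is only attached to the API fallback URL, which matches the updated `_VALID_URL` accepting `api-v2.soundcloud.com/tracks/<id>?secret_token=...`.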
||||
class SoundcloudSetIE(SoundcloudPlaylistBaseIE):
|
||||
@ -480,41 +504,28 @@ class SoundcloudSetIE(SoundcloudPlaylistBaseIE):
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
|
||||
# extract uploader (which is in the url)
|
||||
uploader = mobj.group('uploader')
|
||||
# extract simple title (uploader + slug of song title)
|
||||
slug_title = mobj.group('slug_title')
|
||||
full_title = '%s/sets/%s' % (uploader, slug_title)
|
||||
url = 'https://soundcloud.com/%s/sets/%s' % (uploader, slug_title)
|
||||
|
||||
full_title = '%s/sets/%s' % mobj.group('uploader', 'slug_title')
|
||||
token = mobj.group('token')
|
||||
if token:
|
||||
full_title += '/' + token
|
||||
url += '/' + token
|
||||
|
||||
resolv_url = self._resolv_url(url)
|
||||
info = self._download_json(resolv_url, full_title)
|
||||
info = self._download_json(self._resolv_url(
|
||||
self._BASE_URL + full_title), full_title)
|
||||
|
||||
if 'errors' in info:
|
||||
msgs = (compat_str(err['error_message']) for err in info['errors'])
|
||||
raise ExtractorError('unable to download video webpage: %s' % ','.join(msgs))
|
||||
|
||||
entries = self._extract_track_entries(info['tracks'])
|
||||
entries = self._extract_track_entries(info['tracks'], token)
|
||||
|
||||
return {
|
||||
'_type': 'playlist',
|
||||
'entries': entries,
|
||||
'id': '%s' % info['id'],
|
||||
'title': info['title'],
|
||||
}
|
||||
return self.playlist_result(
|
||||
entries, str_or_none(info.get('id')), info.get('title'))
|
||||
|
||||
|
||||
class SoundcloudPagedPlaylistBaseIE(SoundcloudPlaylistBaseIE):
|
||||
_API_V2_BASE = 'https://api-v2.soundcloud.com'
|
||||
|
||||
def _extract_playlist(self, base_url, playlist_id, playlist_title):
|
||||
COMMON_QUERY = {
|
||||
'limit': 50,
|
||||
'limit': 2000000000,
|
||||
'client_id': self._CLIENT_ID,
|
||||
'linked_partitioning': '1',
|
||||
}
|
||||
@ -522,12 +533,13 @@ class SoundcloudPagedPlaylistBaseIE(SoundcloudPlaylistBaseIE):
|
||||
query = COMMON_QUERY.copy()
|
||||
query['offset'] = 0
|
||||
|
||||
next_href = base_url + '?' + compat_urllib_parse_urlencode(query)
|
||||
next_href = base_url
|
||||
|
||||
entries = []
|
||||
for i in itertools.count():
|
||||
response = self._download_json(
|
||||
next_href, playlist_id, 'Downloading track page %s' % (i + 1))
|
||||
next_href, playlist_id,
|
||||
'Downloading track page %s' % (i + 1), query=query)
|
||||
|
||||
collection = response['collection']
|
||||
|
||||
@ -546,9 +558,8 @@ class SoundcloudPagedPlaylistBaseIE(SoundcloudPlaylistBaseIE):
|
||||
continue
|
||||
return self.url_result(
|
||||
permalink_url,
|
||||
ie=SoundcloudIE.ie_key() if SoundcloudIE.suitable(permalink_url) else None,
|
||||
video_id=self._extract_id(cand),
|
||||
video_title=cand.get('title'))
|
||||
SoundcloudIE.ie_key() if SoundcloudIE.suitable(permalink_url) else None,
|
||||
str_or_none(cand.get('id')), cand.get('title'))
|
||||
|
||||
for e in collection:
|
||||
entry = resolve_entry((e, e.get('track'), e.get('playlist')))
|
||||
@ -559,11 +570,10 @@ class SoundcloudPagedPlaylistBaseIE(SoundcloudPlaylistBaseIE):
|
||||
if not next_href:
|
||||
break
|
||||
|
||||
parsed_next_href = compat_urlparse.urlparse(response['next_href'])
|
||||
qs = compat_urlparse.parse_qs(parsed_next_href.query)
|
||||
qs.update(COMMON_QUERY)
|
||||
next_href = compat_urlparse.urlunparse(
|
||||
parsed_next_href._replace(query=compat_urllib_parse_urlencode(qs, True)))
|
||||
next_href = response['next_href']
|
||||
parsed_next_href = compat_urlparse.urlparse(next_href)
|
||||
query = compat_urlparse.parse_qs(parsed_next_href.query)
|
||||
query.update(COMMON_QUERY)
|
||||
|
||||
return {
|
||||
'_type': 'playlist',
|
||||
@ -609,7 +619,7 @@ class SoundcloudUserIE(SoundcloudPagedPlaylistBaseIE):
|
||||
'url': 'https://soundcloud.com/jcv246/sets',
|
||||
'info_dict': {
|
||||
'id': '12982173',
|
||||
'title': 'Jordi / cv (Playlists)',
|
||||
'title': 'Jordi / cv (Sets)',
|
||||
},
|
||||
'playlist_mincount': 2,
|
||||
}, {
|
||||
@ -636,39 +646,29 @@ class SoundcloudUserIE(SoundcloudPagedPlaylistBaseIE):
|
||||
}]
|
||||
|
||||
_BASE_URL_MAP = {
|
||||
'all': '%s/stream/users/%%s' % SoundcloudPagedPlaylistBaseIE._API_V2_BASE,
|
||||
'tracks': '%s/users/%%s/tracks' % SoundcloudPagedPlaylistBaseIE._API_V2_BASE,
|
||||
'albums': '%s/users/%%s/albums' % SoundcloudPagedPlaylistBaseIE._API_V2_BASE,
|
||||
'sets': '%s/users/%%s/playlists' % SoundcloudPagedPlaylistBaseIE._API_V2_BASE,
|
||||
'reposts': '%s/stream/users/%%s/reposts' % SoundcloudPagedPlaylistBaseIE._API_V2_BASE,
|
||||
'likes': '%s/users/%%s/likes' % SoundcloudPagedPlaylistBaseIE._API_V2_BASE,
|
||||
'spotlight': '%s/users/%%s/spotlight' % SoundcloudPagedPlaylistBaseIE._API_V2_BASE,
|
||||
}
|
||||
|
||||
_TITLE_MAP = {
|
||||
'all': 'All',
|
||||
'tracks': 'Tracks',
|
||||
'albums': 'Albums',
|
||||
'sets': 'Playlists',
|
||||
'reposts': 'Reposts',
|
||||
'likes': 'Likes',
|
||||
'spotlight': 'Spotlight',
|
||||
'all': 'stream/users/%s',
|
||||
'tracks': 'users/%s/tracks',
|
||||
'albums': 'users/%s/albums',
|
||||
'sets': 'users/%s/playlists',
|
||||
'reposts': 'stream/users/%s/reposts',
|
||||
'likes': 'users/%s/likes',
|
||||
'spotlight': 'users/%s/spotlight',
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
uploader = mobj.group('user')
|
||||
|
||||
url = 'https://soundcloud.com/%s/' % uploader
|
||||
resolv_url = self._resolv_url(url)
|
||||
user = self._download_json(
|
||||
resolv_url, uploader, 'Downloading user info')
|
||||
self._resolv_url(self._BASE_URL + uploader),
|
||||
uploader, 'Downloading user info')
|
||||
|
||||
resource = mobj.group('rsrc') or 'all'
|
||||
|
||||
return self._extract_playlist(
|
||||
self._BASE_URL_MAP[resource] % user['id'], compat_str(user['id']),
|
||||
'%s (%s)' % (user['username'], self._TITLE_MAP[resource]))
|
||||
self._API_V2_BASE + self._BASE_URL_MAP[resource] % user['id'],
|
||||
str_or_none(user.get('id')),
|
||||
'%s (%s)' % (user['username'], resource.capitalize()))
|
||||
|
||||
|
||||
class SoundcloudTrackStationIE(SoundcloudPagedPlaylistBaseIE):
|
||||
@ -678,7 +678,7 @@ class SoundcloudTrackStationIE(SoundcloudPagedPlaylistBaseIE):
|
||||
'url': 'https://soundcloud.com/stations/track/officialsundial/your-text',
|
||||
'info_dict': {
|
||||
'id': '286017854',
|
||||
'title': 'Track station: your-text',
|
||||
'title': 'Track station: your text',
|
||||
},
|
||||
'playlist_mincount': 47,
|
||||
}]
|
||||
@ -686,19 +686,17 @@ class SoundcloudTrackStationIE(SoundcloudPagedPlaylistBaseIE):
|
||||
def _real_extract(self, url):
|
||||
track_name = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, track_name)
|
||||
|
||||
track = self._download_json(self._resolv_url(url), track_name)
|
||||
track_id = self._search_regex(
|
||||
r'soundcloud:track-stations:(\d+)', webpage, 'track id')
|
||||
r'soundcloud:track-stations:(\d+)', track['id'], 'track id')
|
||||
|
||||
return self._extract_playlist(
|
||||
'%s/stations/soundcloud:track-stations:%s/tracks'
|
||||
% (self._API_V2_BASE, track_id),
|
||||
track_id, 'Track station: %s' % track_name)
|
||||
self._API_V2_BASE + 'stations/%s/tracks' % track['id'],
|
||||
track_id, 'Track station: %s' % track['title'])
|
||||
|
||||
|
||||
class SoundcloudPlaylistIE(SoundcloudPlaylistBaseIE):
|
||||
_VALID_URL = r'https?://api\.soundcloud\.com/playlists/(?P<id>[0-9]+)(?:/?\?secret_token=(?P<token>[^&]+?))?$'
|
||||
_VALID_URL = r'https?://api(?:-v2)?\.soundcloud\.com/playlists/(?P<id>[0-9]+)(?:/?\?secret_token=(?P<token>[^&]+?))?$'
|
||||
IE_NAME = 'soundcloud:playlist'
|
||||
_TESTS = [{
|
||||
'url': 'https://api.soundcloud.com/playlists/4110309',
|
||||
@ -713,29 +711,22 @@ class SoundcloudPlaylistIE(SoundcloudPlaylistBaseIE):
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
playlist_id = mobj.group('id')
|
||||
base_url = '%s//api.soundcloud.com/playlists/%s.json?' % (self.http_scheme(), playlist_id)
|
||||
|
||||
data_dict = {
|
||||
query = {
|
||||
'client_id': self._CLIENT_ID,
|
||||
}
|
||||
token = mobj.group('token')
|
||||
|
||||
if token:
|
||||
data_dict['secret_token'] = token
|
||||
query['secret_token'] = token
|
||||
|
||||
data = compat_urllib_parse_urlencode(data_dict)
|
||||
data = self._download_json(
|
||||
base_url + data, playlist_id, 'Downloading playlist')
|
||||
self._API_V2_BASE + 'playlists/' + playlist_id,
|
||||
playlist_id, 'Downloading playlist', query=query)
|
||||
|
||||
entries = self._extract_track_entries(data['tracks'])
|
||||
entries = self._extract_track_entries(data['tracks'], token)
|
||||
|
||||
return {
|
||||
'_type': 'playlist',
|
||||
'id': playlist_id,
|
||||
'title': data.get('title'),
|
||||
'description': data.get('description'),
|
||||
'entries': entries,
|
||||
}
|
||||
return self.playlist_result(
|
||||
entries, playlist_id, data.get('title'), data.get('description'))
|
||||
|
||||
|
||||
class SoundcloudSearchIE(SearchInfoExtractor, SoundcloudIE):
|
||||
@ -753,18 +744,18 @@ class SoundcloudSearchIE(SearchInfoExtractor, SoundcloudIE):
|
||||
_SEARCH_KEY = 'scsearch'
|
||||
_MAX_RESULTS_PER_PAGE = 200
|
||||
_DEFAULT_RESULTS_PER_PAGE = 50
|
||||
_API_V2_BASE = 'https://api-v2.soundcloud.com'
|
||||
|
||||
def _get_collection(self, endpoint, collection_id, **query):
|
||||
limit = min(
|
||||
query.get('limit', self._DEFAULT_RESULTS_PER_PAGE),
|
||||
self._MAX_RESULTS_PER_PAGE)
|
||||
query['limit'] = limit
|
||||
query['client_id'] = self._CLIENT_ID
|
||||
query['linked_partitioning'] = '1'
|
||||
query['offset'] = 0
|
||||
data = compat_urllib_parse_urlencode(query)
|
||||
next_url = '{0}{1}?{2}'.format(self._API_V2_BASE, endpoint, data)
|
||||
query.update({
|
||||
'limit': limit,
|
||||
'client_id': self._CLIENT_ID,
|
||||
'linked_partitioning': 1,
|
||||
'offset': 0,
|
||||
})
|
||||
next_url = update_url_query(self._API_V2_BASE + endpoint, query)
|
||||
|
||||
collected_results = 0
|
||||
|
||||
@ -791,5 +782,5 @@ class SoundcloudSearchIE(SearchInfoExtractor, SoundcloudIE):
|
||||
break
|
||||
|
||||
def _get_n_results(self, query, n):
|
||||
tracks = self._get_collection('/search/tracks', query, limit=n, q=query)
|
||||
tracks = self._get_collection('search/tracks', query, limit=n, q=query)
|
||||
return self.playlist_result(tracks, playlist_title=query)
|
||||
|
@ -4,15 +4,10 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_urllib_parse_urlparse
|
||||
)
|
||||
from ..utils import (
|
||||
extract_attributes,
|
||||
compat_str,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
str_or_none,
|
||||
)
|
||||
|
||||
|
||||
@ -20,20 +15,20 @@ class STVPlayerIE(InfoExtractor):
|
||||
IE_NAME = 'stv:player'
|
||||
_VALID_URL = r'https?://player\.stv\.tv/(?P<type>episode|video)/(?P<id>[a-z0-9]{4})'
|
||||
_TEST = {
|
||||
'url': 'https://player.stv.tv/video/7srz/victoria/interview-with-the-cast-ahead-of-new-victoria/',
|
||||
'md5': '2ad867d4afd641fa14187596e0fbc91b',
|
||||
'url': 'https://player.stv.tv/video/4gwd/emmerdale/60-seconds-on-set-with-laura-norton/',
|
||||
'md5': '5adf9439c31d554f8be0707c7abe7e0a',
|
||||
'info_dict': {
|
||||
'id': '6016487034001',
|
||||
'id': '5333973339001',
|
||||
'ext': 'mp4',
|
||||
'upload_date': '20190321',
|
||||
'title': 'Interview with the cast ahead of new Victoria',
|
||||
'description': 'Nell Hudson and Lily Travers tell us what to expect in the new season of Victoria.',
|
||||
'timestamp': 1553179628,
|
||||
'upload_date': '20170301',
|
||||
'title': '60 seconds on set with Laura Norton',
|
||||
'description': "How many questions can Laura - a.k.a Kerry Wyatt - answer in 60 seconds? Let\'s find out!",
|
||||
'timestamp': 1488388054,
|
||||
'uploader_id': '1486976045',
|
||||
},
|
||||
'skip': 'this resource is unavailable outside of the UK',
|
||||
}
|
||||
_PUBLISHER_ID = '1486976045'
|
||||
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1486976045/default_default/index.html?videoId=%s'
|
||||
_PTYPE_MAP = {
|
||||
'episode': 'episodes',
|
||||
'video': 'shortform',
|
||||
@ -41,54 +36,32 @@ class STVPlayerIE(InfoExtractor):
|
||||
|
||||
def _real_extract(self, url):
|
||||
ptype, video_id = re.match(self._VALID_URL, url).groups()
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
resp = self._download_json(
|
||||
'https://player.api.stv.tv/v1/%s/%s' % (self._PTYPE_MAP[ptype], video_id),
|
||||
video_id)
|
||||
|
||||
qs = compat_parse_qs(compat_urllib_parse_urlparse(self._search_regex(
|
||||
r'itemprop="embedURL"[^>]+href="([^"]+)',
|
||||
webpage, 'embed URL', default=None)).query)
|
||||
publisher_id = qs.get('publisherID', [None])[0] or self._PUBLISHER_ID
|
||||
result = resp['results']
|
||||
video = result['video']
|
||||
video_id = compat_str(video['id'])
|
||||
|
||||
player_attr = extract_attributes(self._search_regex(
|
||||
r'(<[^>]+class="bcplayer"[^>]+>)', webpage, 'player', default=None)) or {}
|
||||
subtitles = {}
|
||||
_subtitles = result.get('_subtitles') or {}
|
||||
for ext, sub_url in _subtitles.items():
|
||||
subtitles.setdefault('en', []).append({
|
||||
'ext': 'vtt' if ext == 'webvtt' else ext,
|
||||
'url': sub_url,
|
||||
})
|
||||
|
||||
info = {}
|
||||
duration = ref_id = series = video_id = None
|
||||
api_ref_id = player_attr.get('data-player-api-refid')
|
||||
if api_ref_id:
|
||||
resp = self._download_json(
|
||||
'https://player.api.stv.tv/v1/%s/%s' % (self._PTYPE_MAP[ptype], api_ref_id),
|
||||
api_ref_id, fatal=False)
|
||||
if resp:
|
||||
result = resp.get('results') or {}
|
||||
video = result.get('video') or {}
|
||||
video_id = str_or_none(video.get('id'))
|
||||
ref_id = video.get('guid')
|
||||
duration = video.get('length')
|
||||
programme = result.get('programme') or {}
|
||||
series = programme.get('name') or programme.get('shortName')
|
||||
subtitles = {}
|
||||
_subtitles = result.get('_subtitles') or {}
|
||||
for ext, sub_url in _subtitles.items():
|
||||
subtitles.setdefault('en', []).append({
|
||||
'ext': 'vtt' if ext == 'webvtt' else ext,
|
||||
'url': sub_url,
|
||||
})
|
||||
info.update({
|
||||
'description': result.get('summary'),
|
||||
'subtitles': subtitles,
|
||||
'view_count': int_or_none(result.get('views')),
|
||||
})
|
||||
if not video_id:
|
||||
video_id = qs.get('videoId', [None])[0] or self._search_regex(
|
||||
r'<link\s+itemprop="url"\s+href="(\d+)"',
|
||||
webpage, 'video id', default=None) or 'ref:' + (ref_id or player_attr['data-refid'])
|
||||
programme = result.get('programme') or {}
|
||||
|
||||
info.update({
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'duration': float_or_none(duration or player_attr.get('data-duration'), 1000),
|
||||
'id': video_id,
|
||||
'url': self.BRIGHTCOVE_URL_TEMPLATE % video_id,
|
||||
'description': result.get('summary'),
|
||||
'duration': float_or_none(video.get('length'), 1000),
|
||||
'subtitles': subtitles,
|
||||
'view_count': int_or_none(result.get('views')),
|
||||
'series': programme.get('name') or programme.get('shortName'),
|
||||
'ie_key': 'BrightcoveNew',
|
||||
'series': series or player_attr.get('data-programme-name'),
|
||||
'url': 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s' % (publisher_id, video_id),
|
||||
})
|
||||
return info
|
||||
}
|
||||
|
@ -48,6 +48,16 @@ class TeachableBaseIE(InfoExtractor):
|
||||
'https://%s/sign_in' % site, None,
|
||||
'Downloading %s login page' % site)
|
||||
|
||||
def is_logged(webpage):
|
||||
return any(re.search(p, webpage) for p in (
|
||||
r'class=["\']user-signout',
|
||||
r'<a[^>]+\bhref=["\']/sign_out',
|
||||
r'Log\s+[Oo]ut\s*<'))
|
||||
|
||||
if is_logged(login_page):
|
||||
self._logged_in = True
|
||||
return
|
||||
|
||||
login_url = compat_str(urlh.geturl())
|
||||
|
||||
login_form = self._hidden_inputs(login_page)
|
||||
@ -78,10 +88,7 @@ class TeachableBaseIE(InfoExtractor):
|
||||
'Go to https://%s/ and accept.' % (site, site), expected=True)
|
||||
|
||||
# Successful login
|
||||
if any(re.search(p, response) for p in (
|
||||
r'class=["\']user-signout',
|
||||
r'<a[^>]+\bhref=["\']/sign_out',
|
||||
r'>\s*Log out\s*<')):
|
||||
if is_logged(response):
|
||||
self._logged_in = True
|
||||
return
|
||||
|
||||
|
@ -84,6 +84,19 @@ class TeamcocoIE(TurnerBaseIE):
|
||||
'only_matching': True,
|
||||
}
|
||||
]
|
||||
_RECORD_TEMPL = '''id
|
||||
title
|
||||
teaser
|
||||
publishOn
|
||||
thumb {
|
||||
preview
|
||||
}
|
||||
tags {
|
||||
name
|
||||
}
|
||||
duration
|
||||
turnerMediaId
|
||||
turnerMediaAuthToken'''
|
||||
|
||||
def _graphql_call(self, query_template, object_type, object_id):
|
||||
find_object = 'find' + object_type
|
||||
@ -98,36 +111,36 @@ class TeamcocoIE(TurnerBaseIE):
|
||||
display_id = self._match_id(url)
|
||||
|
||||
response = self._graphql_call('''{
|
||||
%s(slug: "%s") {
|
||||
%%s(slug: "%%s") {
|
||||
... on RecordSlug {
|
||||
record {
|
||||
%s
|
||||
}
|
||||
}
|
||||
... on PageSlug {
|
||||
child {
|
||||
id
|
||||
title
|
||||
teaser
|
||||
publishOn
|
||||
thumb {
|
||||
preview
|
||||
}
|
||||
file {
|
||||
url
|
||||
}
|
||||
tags {
|
||||
name
|
||||
}
|
||||
duration
|
||||
turnerMediaId
|
||||
turnerMediaAuthToken
|
||||
}
|
||||
}
|
||||
... on NotFoundSlug {
|
||||
status
|
||||
}
|
||||
}
|
||||
}''', 'Slug', display_id)
|
||||
}''' % self._RECORD_TEMPL, 'Slug', display_id)
|
||||
if response.get('status'):
|
||||
raise ExtractorError('This video is no longer available.', expected=True)
|
||||
|
||||
record = response['record']
|
||||
child = response.get('child')
|
||||
if child:
|
||||
record = self._graphql_call('''{
|
||||
%%s(id: "%%s") {
|
||||
... on Video {
|
||||
%s
|
||||
}
|
||||
}
|
||||
}''' % self._RECORD_TEMPL, 'Record', child['id'])
|
||||
else:
|
||||
record = response['record']
|
||||
video_id = record['id']
|
||||
|
||||
info = {
|
||||
@ -150,25 +163,21 @@ class TeamcocoIE(TurnerBaseIE):
|
||||
'accessTokenType': 'jws',
|
||||
}))
|
||||
else:
|
||||
d = self._download_json(
|
||||
video_sources = self._download_json(
|
||||
'https://teamcoco.com/_truman/d/' + video_id,
|
||||
video_id, fatal=False) or {}
|
||||
video_sources = d.get('meta') or {}
|
||||
if not video_sources:
|
||||
video_sources = self._graphql_call('''{
|
||||
%s(id: "%s") {
|
||||
src
|
||||
}
|
||||
}''', 'RecordVideoSource', video_id) or {}
|
||||
video_id)['meta']['src']
|
||||
if isinstance(video_sources, dict):
|
||||
video_sources = video_sources.values()
|
||||
|
||||
formats = []
|
||||
get_quality = qualities(['low', 'sd', 'hd', 'uhd'])
|
||||
for format_id, src in video_sources.get('src', {}).items():
|
||||
for src in video_sources:
|
||||
if not isinstance(src, dict):
|
||||
continue
|
||||
src_url = src.get('src')
|
||||
if not src_url:
|
||||
continue
|
||||
format_id = src.get('label')
|
||||
ext = determine_ext(src_url, mimetype2ext(src.get('type')))
|
||||
if format_id == 'hls' or ext == 'm3u8':
|
||||
# compat_urllib_parse.urljoin does not work here
|
||||
@ -190,9 +199,6 @@ class TeamcocoIE(TurnerBaseIE):
|
||||
'format_id': format_id,
|
||||
'quality': get_quality(format_id),
|
||||
})
|
||||
if not formats:
|
||||
formats = self._extract_m3u8_formats(
|
||||
record['file']['url'], video_id, 'mp4', fatal=False)
|
||||
self._sort_formats(formats)
|
||||
info['formats'] = formats
|
||||
|
||||
|
@ -182,20 +182,29 @@ class TEDIE(InfoExtractor):
|
||||
|
||||
title = talk_info['title'].strip()
|
||||
|
||||
native_downloads = try_get(
|
||||
talk_info,
|
||||
(lambda x: x['downloads']['nativeDownloads'],
|
||||
lambda x: x['nativeDownloads']),
|
||||
dict) or {}
|
||||
downloads = talk_info.get('downloads') or {}
|
||||
native_downloads = downloads.get('nativeDownloads') or talk_info.get('nativeDownloads') or {}
|
||||
|
||||
formats = [{
|
||||
'url': format_url,
|
||||
'format_id': format_id,
|
||||
'format': format_id,
|
||||
} for (format_id, format_url) in native_downloads.items() if format_url is not None]
|
||||
|
||||
subtitled_downloads = downloads.get('subtitledDownloads') or {}
|
||||
for lang, subtitled_download in subtitled_downloads.items():
|
||||
for q in self._NATIVE_FORMATS:
|
||||
q_url = subtitled_download.get(q)
|
||||
if not q_url:
|
||||
continue
|
||||
formats.append({
|
||||
'url': q_url,
|
||||
'format_id': '%s-%s' % (q, lang),
|
||||
'language': lang,
|
||||
})
|
||||
|
||||
if formats:
|
||||
for f in formats:
|
||||
finfo = self._NATIVE_FORMATS.get(f['format_id'])
|
||||
finfo = self._NATIVE_FORMATS.get(f['format_id'].split('-')[0])
|
||||
if finfo:
|
||||
f.update(finfo)
|
||||
|
||||
@ -215,34 +224,7 @@ class TEDIE(InfoExtractor):
|
||||
|
||||
http_url = None
|
||||
for format_id, resources in resources_.items():
|
||||
if format_id == 'h264':
|
||||
for resource in resources:
|
||||
h264_url = resource.get('file')
|
||||
if not h264_url:
|
||||
continue
|
||||
bitrate = int_or_none(resource.get('bitrate'))
|
||||
formats.append({
|
||||
'url': h264_url,
|
||||
'format_id': '%s-%sk' % (format_id, bitrate),
|
||||
'tbr': bitrate,
|
||||
})
|
||||
if re.search(r'\d+k', h264_url):
|
||||
http_url = h264_url
|
||||
elif format_id == 'rtmp':
|
||||
streamer = talk_info.get('streamer')
|
||||
if not streamer:
|
||||
continue
|
||||
for resource in resources:
|
||||
formats.append({
|
||||
'format_id': '%s-%s' % (format_id, resource.get('name')),
|
||||
'url': streamer,
|
||||
'play_path': resource['file'],
|
||||
'ext': 'flv',
|
||||
'width': int_or_none(resource.get('width')),
|
||||
'height': int_or_none(resource.get('height')),
|
||||
'tbr': int_or_none(resource.get('bitrate')),
|
||||
})
|
||||
elif format_id == 'hls':
|
||||
if format_id == 'hls':
|
||||
if not isinstance(resources, dict):
|
||||
continue
|
||||
stream_url = url_or_none(resources.get('stream'))
|
||||
@ -251,6 +233,36 @@ class TEDIE(InfoExtractor):
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
stream_url, video_name, 'mp4', m3u8_id=format_id,
|
||||
fatal=False))
|
||||
else:
|
||||
if not isinstance(resources, list):
|
||||
continue
|
||||
if format_id == 'h264':
|
||||
for resource in resources:
|
||||
h264_url = resource.get('file')
|
||||
if not h264_url:
|
||||
continue
|
||||
bitrate = int_or_none(resource.get('bitrate'))
|
||||
formats.append({
|
||||
'url': h264_url,
|
||||
'format_id': '%s-%sk' % (format_id, bitrate),
|
||||
'tbr': bitrate,
|
||||
})
|
||||
if re.search(r'\d+k', h264_url):
|
||||
http_url = h264_url
|
||||
elif format_id == 'rtmp':
|
||||
streamer = talk_info.get('streamer')
|
||||
if not streamer:
|
||||
continue
|
||||
for resource in resources:
|
||||
formats.append({
|
||||
'format_id': '%s-%s' % (format_id, resource.get('name')),
|
||||
'url': streamer,
|
||||
'play_path': resource['file'],
|
||||
'ext': 'flv',
|
||||
'width': int_or_none(resource.get('width')),
|
||||
'height': int_or_none(resource.get('height')),
|
||||
'tbr': int_or_none(resource.get('bitrate')),
|
||||
})
|
||||
|
||||
m3u8_formats = list(filter(
|
||||
lambda f: f.get('protocol') == 'm3u8' and f.get('vcodec') != 'none',
|
||||
|
@ -4,21 +4,25 @@ from __future__ import unicode_literals
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
remove_end,
|
||||
int_or_none,
|
||||
parse_iso8601,
|
||||
try_get,
|
||||
)
|
||||
|
||||
|
||||
class TelegraafIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?telegraaf\.nl/tv/(?:[^/]+/)+(?P<id>\d+)/[^/]+\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?telegraaf\.nl/video/(?P<id>\d+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.telegraaf.nl/tv/nieuws/binnenland/24353229/__Tikibad_ontruimd_wegens_brand__.html',
|
||||
'url': 'https://www.telegraaf.nl/video/734366489/historisch-scheepswrak-slaat-na-100-jaar-los',
|
||||
'info_dict': {
|
||||
'id': '24353229',
|
||||
'id': 'gaMItuoSeUg2',
|
||||
'ext': 'mp4',
|
||||
'title': 'Tikibad ontruimd wegens brand',
|
||||
'description': 'md5:05ca046ff47b931f9b04855015e163a4',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'duration': 33,
|
||||
'title': 'Historisch scheepswrak slaat na 100 jaar los',
|
||||
'description': 'md5:6f53b7c4f55596722ac24d6c0ec00cfb',
|
||||
'thumbnail': r're:^https?://.*\.jpg',
|
||||
'duration': 55,
|
||||
'timestamp': 1572805527,
|
||||
'upload_date': '20191103',
|
||||
},
|
||||
'params': {
|
||||
# m3u8 download
|
||||
@ -27,23 +31,30 @@ class TelegraafIE(InfoExtractor):
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
article_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
video_id = self._download_json(
|
||||
'https://www.telegraaf.nl/graphql', article_id, query={
|
||||
'query': '''{
|
||||
article(uid: %s) {
|
||||
videos {
|
||||
videoId
|
||||
}
|
||||
}
|
||||
}''' % article_id,
|
||||
})['data']['article']['videos'][0]['videoId']
|
||||
|
||||
player_url = self._html_search_regex(
|
||||
r'<iframe[^>]+src="([^"]+")', webpage, 'player URL')
|
||||
player_page = self._download_webpage(
|
||||
player_url, video_id, note='Download player webpage')
|
||||
playlist_url = self._search_regex(
|
||||
r'playlist\s*:\s*"([^"]+)"', player_page, 'playlist URL')
|
||||
playlist_data = self._download_json(playlist_url, video_id)
|
||||
item = self._download_json(
|
||||
'https://content.tmgvideo.nl/playlist/item=%s/playlist.json' % video_id,
|
||||
video_id)['items'][0]
|
||||
title = item['title']
|
||||
|
||||
item = playlist_data['items'][0]
|
||||
formats = []
|
||||
locations = item['locations']
|
||||
locations = item.get('locations') or {}
|
||||
for location in locations.get('adaptive', []):
|
||||
manifest_url = location['src']
|
||||
manifest_url = location.get('src')
|
||||
if not manifest_url:
|
||||
continue
|
||||
ext = determine_ext(manifest_url)
|
||||
if ext == 'm3u8':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
@ -54,25 +65,25 @@ class TelegraafIE(InfoExtractor):
|
||||
else:
|
||||
self.report_warning('Unknown adaptive format %s' % ext)
|
||||
for location in locations.get('progressive', []):
|
||||
src = try_get(location, lambda x: x['sources'][0]['src'])
|
||||
if not src:
|
||||
continue
|
||||
label = location.get('label')
|
||||
formats.append({
|
||||
'url': location['sources'][0]['src'],
|
||||
'width': location.get('width'),
|
||||
'height': location.get('height'),
|
||||
'format_id': 'http-%s' % location['label'],
|
||||
'url': src,
|
||||
'width': int_or_none(location.get('width')),
|
||||
'height': int_or_none(location.get('height')),
|
||||
'format_id': 'http' + ('-%s' % label if label else ''),
|
||||
})
|
||||
|
||||
self._sort_formats(formats)
|
||||
|
||||
title = remove_end(self._og_search_title(webpage), ' - VIDEO')
|
||||
description = self._og_search_description(webpage)
|
||||
duration = item.get('duration')
|
||||
thumbnail = item.get('poster')
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'description': item.get('description'),
|
||||
'formats': formats,
|
||||
'duration': duration,
|
||||
'thumbnail': thumbnail,
|
||||
'duration': int_or_none(item.get('duration')),
|
||||
'thumbnail': item.get('poster'),
|
||||
'timestamp': parse_iso8601(item.get('datecreated'), ' '),
|
||||
}
|
||||
|
@ -7,6 +7,7 @@ from ..utils import (
|
||||
int_or_none,
|
||||
smuggle_url,
|
||||
try_get,
|
||||
unified_timestamp,
|
||||
)
|
||||
|
||||
|
||||
@ -22,7 +23,13 @@ class TeleQuebecBaseIE(InfoExtractor):
|
||||
|
||||
|
||||
class TeleQuebecIE(TeleQuebecBaseIE):
|
||||
_VALID_URL = r'https?://zonevideo\.telequebec\.tv/media/(?P<id>\d+)'
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:
|
||||
zonevideo\.telequebec\.tv/media|
|
||||
coucou\.telequebec\.tv/videos
|
||||
)/(?P<id>\d+)
|
||||
'''
|
||||
_TESTS = [{
|
||||
# available till 01.01.2023
|
||||
'url': 'http://zonevideo.telequebec.tv/media/37578/un-petit-choc-et-puis-repart/un-chef-a-la-cabane',
|
||||
@ -41,6 +48,9 @@ class TeleQuebecIE(TeleQuebecBaseIE):
|
||||
# no description
|
||||
'url': 'http://zonevideo.telequebec.tv/media/30261',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://coucou.telequebec.tv/videos/41788/idee-de-genie/l-heure-du-bain',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
@ -61,6 +71,52 @@ class TeleQuebecIE(TeleQuebecBaseIE):
|
||||
return info
|
||||
|
||||
|
||||
class TeleQuebecSquatIE(InfoExtractor):
|
||||
_VALID_URL = r'https://squat\.telequebec\.tv/videos/(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://squat.telequebec.tv/videos/9314',
|
||||
'info_dict': {
|
||||
'id': 'd59ae78112d542e793d83cc9d3a5b530',
|
||||
'ext': 'mp4',
|
||||
'title': 'Poupeflekta',
|
||||
'description': 'md5:2f0718f8d2f8fece1646ee25fb7bce75',
|
||||
'duration': 1351,
|
||||
'timestamp': 1569057600,
|
||||
'upload_date': '20190921',
|
||||
'series': 'Miraculous : Les Aventures de Ladybug et Chat Noir',
|
||||
'season': 'Saison 3',
|
||||
'season_number': 3,
|
||||
'episode_number': 57,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
video = self._download_json(
|
||||
'https://squat.api.telequebec.tv/v1/videos/%s' % video_id,
|
||||
video_id)
|
||||
|
||||
media_id = video['sourceId']
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'url': 'http://zonevideo.telequebec.tv/media/%s' % media_id,
|
||||
'ie_key': TeleQuebecIE.ie_key(),
|
||||
'id': media_id,
|
||||
'title': video.get('titre'),
|
||||
'description': video.get('description'),
|
||||
'timestamp': unified_timestamp(video.get('datePublication')),
|
||||
'series': video.get('container'),
|
||||
'season': video.get('saison'),
|
||||
'season_number': int_or_none(video.get('noSaison')),
|
||||
'episode_number': int_or_none(video.get('episode')),
|
||||
}
|
||||
|
||||
|
||||
class TeleQuebecEmissionIE(TeleQuebecBaseIE):
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
|
55
youtube_dl/extractor/tenplay.py
Normal file
@ -0,0 +1,55 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
parse_age_limit,
|
||||
parse_iso8601,
|
||||
smuggle_url,
|
||||
)
|
||||
|
||||
|
||||
class TenPlayIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?10play\.com\.au/[^/]+/episodes/[^/]+/[^/]+/(?P<id>tpv\d{6}[a-z]{5})'
|
||||
_TEST = {
|
||||
'url': 'https://10play.com.au/masterchef/episodes/season-1/masterchef-s1-ep-1/tpv190718kwzga',
|
||||
'info_dict': {
|
||||
'id': '6060533435001',
|
||||
'ext': 'mp4',
|
||||
'title': 'MasterChef - S1 Ep. 1',
|
||||
'description': 'md5:4fe7b78e28af8f2d900cd20d900ef95c',
|
||||
'age_limit': 10,
|
||||
'timestamp': 1240828200,
|
||||
'upload_date': '20090427',
|
||||
'uploader_id': '2199827728001',
|
||||
},
|
||||
'params': {
|
||||
'format': 'bestvideo',
|
||||
'skip_download': True,
|
||||
}
|
||||
}
|
||||
BRIGHTCOVE_URL_TEMPLATE = 'https://players.brightcove.net/2199827728001/cN6vRtRQt_default/index.html?videoId=%s'
|
||||
|
||||
def _real_extract(self, url):
|
||||
content_id = self._match_id(url)
|
||||
data = self._download_json(
|
||||
'https://10play.com.au/api/video/' + content_id, content_id)
|
||||
video = data.get('video') or {}
|
||||
metadata = data.get('metaData') or {}
|
||||
brightcove_id = video.get('videoId') or metadata['showContentVideoId']
|
||||
brightcove_url = smuggle_url(
|
||||
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
|
||||
{'geo_countries': ['AU']})
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'url': brightcove_url,
|
||||
'id': content_id,
|
||||
'title': video.get('title') or metadata.get('pageContentName') or metadata.get('showContentName'),
|
||||
'description': video.get('description'),
|
||||
'age_limit': parse_age_limit(video.get('showRatingClassification') or metadata.get('showProgramClassification')),
|
||||
'series': metadata.get('showName'),
|
||||
'season': metadata.get('showContentSeason'),
|
||||
'timestamp': parse_iso8601(metadata.get('contentPublishDate') or metadata.get('pageContentPublishDate')),
|
||||
'ie_key': 'BrightcoveNew',
|
||||
}
|
@ -3,7 +3,7 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .ooyala import OoyalaIE
|
||||
from ..utils import extract_attributes
|
||||
|
||||
|
||||
class TheSunIE(InfoExtractor):
|
||||
@ -16,6 +16,7 @@ class TheSunIE(InfoExtractor):
|
||||
},
|
||||
'playlist_count': 2,
|
||||
}
|
||||
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s'
|
||||
|
||||
def _real_extract(self, url):
|
||||
article_id = self._match_id(url)
|
||||
@ -23,10 +24,15 @@ class TheSunIE(InfoExtractor):
|
||||
webpage = self._download_webpage(url, article_id)
|
||||
|
||||
entries = []
|
||||
for ooyala_id in re.findall(
|
||||
r'<[^>]+\b(?:id\s*=\s*"thesun-ooyala-player-|data-content-id\s*=\s*")([^"]+)',
|
||||
for video in re.findall(
|
||||
r'<video[^>]+data-video-id-pending=[^>]+>',
|
||||
webpage):
|
||||
entries.append(OoyalaIE._build_url_result(ooyala_id))
|
||||
attrs = extract_attributes(video)
|
||||
video_id = attrs['data-video-id-pending']
|
||||
account_id = attrs.get('data-account', '5067014667001')
|
||||
entries.append(self.url_result(
|
||||
self.BRIGHTCOVE_URL_TEMPLATE % (account_id, video_id),
|
||||
'BrightcoveNew', video_id))
|
||||
|
||||
return self.playlist_result(
|
||||
entries, article_id, self._og_search_title(webpage, fatal=False))
|
||||
|
@ -1,36 +0,0 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_b64decode,
|
||||
compat_parse_qs,
|
||||
)
|
||||
|
||||
|
||||
class TutvIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?tu\.tv/videos/(?P<id>[^/?]+)'
|
||||
_TEST = {
|
||||
'url': 'http://tu.tv/videos/robots-futbolistas',
|
||||
'md5': '0cd9e28ad270488911b0d2a72323395d',
|
||||
'info_dict': {
|
||||
'id': '2973058',
|
||||
'ext': 'mp4',
|
||||
'title': 'Robots futbolistas',
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
internal_id = self._search_regex(r'codVideo=([0-9]+)', webpage, 'internal video ID')
|
||||
|
||||
data_content = self._download_webpage(
|
||||
'http://tu.tv/flvurl.php?codVideo=%s' % internal_id, video_id, 'Downloading video info')
|
||||
video_url = compat_b64decode(compat_parse_qs(data_content)['kpt'][0]).decode('utf-8')
|
||||
|
||||
return {
|
||||
'id': internal_id,
|
||||
'url': video_url,
|
||||
'title': self._og_search_title(webpage),
|
||||
}
|
Some files were not shown because too many files have changed in this diff.