Prevent HTTP 301 for YouTube playlist continuations
When a YouTube playlist or channel listing has more than one page of videos, the continuation URLs specify `youtube.com` instead of `www.youtube.com`. Each such request is answered with an HTTP 301 redirect to `www.youtube.com`, so the extractor makes an unnecessary HTTP round-trip for every continuation page it accesses.

**Example**

<code>
youtube-dl -s --print-traffic https://www.youtube.com/channel/UCBR8-60-B28hp2BmDPdntcQ
</code>

**Before**

<code>
GET /playlist?list=UUBR8-60-B28hp2BmDPdntcQ&disable_polymer=true
Host: www.youtube.com
HTTP/1.1 200 OK

GET /browse_ajax?action_continuation=1&continuation=4qmFsgIsEhpWTFVVQlI4LTYwLUIyOGhwMkJtRFBkbnRjURoOZWdaUVZEcERSMUUlM0Q%253D&disable_polymer=true
Host: youtube.com
HTTP/1.1 301 Moved Permanently
Location: https://www.youtube.com/browse_ajax?action_continuation=1&continuation=4qmFsgIsEhpWTFVVQlI4LTYwLUIyOGhwMkJtRFBkbnRjURoOZWdaUVZEcERSMUUlM0Q%253D&disable_polymer=true

GET /browse_ajax?action_continuation=1&continuation=4qmFsgIsEhpWTFVVQlI4LTYwLUIyOGhwMkJtRFBkbnRjURoOZWdaUVZEcERSMUUlM0Q%253D&disable_polymer=true
Host: www.youtube.com
HTTP/1.1 200 OK

GET /browse_ajax?action_continuation=1&continuation=4qmFsgIqEhpWTFVVQlI4LTYwLUIyOGhwMkJtRFBkbnRjURoMZWdkUVZEcERUV2RD&disable_polymer=true
Host: youtube.com
HTTP/1.1 301 Moved Permanently
Location: https://www.youtube.com/browse_ajax?action_continuation=1&continuation=4qmFsgIqEhpWTFVVQlI4LTYwLUIyOGhwMkJtRFBkbnRjURoMZWdkUVZEcERUV2RD&disable_polymer=true

GET /browse_ajax?action_continuation=1&continuation=4qmFsgIqEhpWTFVVQlI4LTYwLUIyOGhwMkJtRFBkbnRjURoMZWdkUVZEcERUV2RD&disable_polymer=true
Host: www.youtube.com
HTTP/1.1 200 OK

GET /browse_ajax?action_continuation=1&continuation=4qmFsgIqEhpWTFVVQlI4LTYwLUIyOGhwMkJtRFBkbnRjURoMZWdkUVZEcERTM2RE&disable_polymer=true
Host: youtube.com
HTTP/1.1 301 Moved Permanently
Location: https://www.youtube.com/browse_ajax?action_continuation=1&continuation=4qmFsgIqEhpWTFVVQlI4LTYwLUIyOGhwMkJtRFBkbnRjURoMZWdkUVZEcERTM2RE&disable_polymer=true

GET /browse_ajax?action_continuation=1&continuation=4qmFsgIqEhpWTFVVQlI4LTYwLUIyOGhwMkJtRFBkbnRjURoMZWdkUVZEcERTM2RE&disable_polymer=true
Host: www.youtube.com
HTTP/1.1 200 OK
</code>

**After**

<code>
GET /playlist?list=UUBR8-60-B28hp2BmDPdntcQ&disable_polymer=true
Host: www.youtube.com
HTTP/1.1 200 OK

GET /browse_ajax?action_continuation=1&continuation=4qmFsgIsEhpWTFVVQlI4LTYwLUIyOGhwMkJtRFBkbnRjURoOZWdaUVZEcERSMUUlM0Q%253D&disable_polymer=true
Host: www.youtube.com
HTTP/1.1 200 OK

GET /browse_ajax?action_continuation=1&continuation=4qmFsgIqEhpWTFVVQlI4LTYwLUIyOGhwMkJtRFBkbnRjURoMZWdkUVZEcERUV2RD&disable_polymer=true
Host: www.youtube.com
HTTP/1.1 200 OK
</code>
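
The cost of the redirect can be estimated from the traces: each continuation page takes two requests (a 301 followed by a 200) when `youtube.com` is used, and one request when `www.youtube.com` is used. A rough sketch of that arithmetic follows; `total_requests` is a hypothetical helper written for illustration and is not part of youtube-dl.

```python
def total_requests(continuation_pages, redirected):
    """Count HTTP requests for one playlist listing.

    One request fetches the initial playlist page. Each continuation
    page costs two requests if it is redirected (301, then 200) and
    one request otherwise. This models only the traffic shown in the
    traces and ignores retries.
    """
    per_page = 2 if redirected else 1
    return 1 + continuation_pages * per_page


# The "Before" trace lists three continuation pages.
print(total_requests(3, redirected=True))   # 7
print(total_requests(3, redirected=False))  # 4
```

For a channel with many pages of videos, the number of avoided requests grows linearly with the page count.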
parent 9a7e5cb88a
commit bd1340d294
@@ -303,7 +303,7 @@ class YoutubeEntryListBaseInfoExtractor(YoutubeBaseInfoExtractor):
 # Downloading page may result in intermittent 5xx HTTP error
 # that is usually worked around with a retry
 more = self._download_json(
-    'https://youtube.com/%s' % mobj.group('more'), playlist_id,
+    'https://www.youtube.com/%s' % mobj.group('more'), playlist_id,
     'Downloading page #%s%s'
     % (page_num, ' (retry #%d)' % count if count else ''),
     transform_source=uppercase_escape,
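
The effect of the one-line change can be reproduced outside the extractor. This is a minimal sketch, not the extractor's code: `continuation_url` and `CANONICAL_HOST` are hypothetical names, and it assumes the captured `more` group is a path relative to the site root (the leading-slash handling is an extra precaution that is not in the diff).

```python
from urllib.parse import urlsplit

# Host used by the patched line in the diff.
CANONICAL_HOST = 'www.youtube.com'


def continuation_url(more_path):
    """Build an absolute continuation URL on the canonical host.

    `more_path` stands in for mobj.group('more'). A leading slash is
    stripped so the result never contains a double slash.
    """
    return 'https://%s/%s' % (CANONICAL_HOST, more_path.lstrip('/'))


url = continuation_url('browse_ajax?action_continuation=1&disable_polymer=true')
# The request goes straight to the host that previously appeared
# in the Location header of the 301 response.
assert urlsplit(url).hostname == CANONICAL_HOST
```

Building the URL with the `www.` prefix up front means the first request for each continuation page already targets the host the server would otherwise redirect to.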