Regex YouTube Parser
Over the next several days I’m going to post a series of Regex strings that can be used to parse input for different kinds of links. My examples use PHP, so you may need to tweak the Regex to work in another language.
Today’s Regex is used to parse a string for YouTube links and extract the Video_ID. Once I have the Video_ID I’m able to create a standard embed code from nearly any YouTube URL.
Supported Links:
The following Regex supports these YouTube links and embed code snippets.
Short URL : http://youtu.be/OxWMsxa5uVk
Normal URL: http://www.youtube.com/watch?v=OxWMsxa5uVk&t=28s
HTTPS URL : https://www.youtube.com/watch?v=OxWMsxa5uVk&feature=g-logo
New Embed : <iframe width="560" height="315" src="http://www.youtube.com/embed/OxWMsxa5uVk" frameborder="0" allowfullscreen></iframe>
Old Embed : <object width="1280" height="720">
            <param name="movie" value="http://www.youtube.com/v/OxWMsxa5uVk?version=3&hl=en_US&rel=0"></param>
            <param name="allowFullScreen" value="true"></param>
            <param name="allowscriptaccess" value="always"></param>
            <embed src="http://www.youtube.com/v/OxWMsxa5uVk?version=3&hl=en_US&rel=0" type="application/x-shockwave-flash" width="1280" height="720" allowscriptaccess="always" allowfullscreen="true"></embed>
            </object>
I won’t spend a lot of time explaining the Regex because I’ve broken it up and commented each line.
$regexstr = '~
    # Match Youtube link and embed code
    (?:                             # Group to match embed codes
        (?:<iframe [^>]*src=")?     # If iframe match up to first quote of src
        |(?:                        # Group to match if older embed
            (?:<object .*>)?        # Match opening Object tag
            (?:<param .*</param>)*  # Match all param tags
            (?:<embed [^>]*src=")?  # Match embed tag to the first quote of src
        )?                          # End older embed code group
    )?                              # End embed code groups
    (?:                             # Group youtube url
        https?:\/\/                 # Either http or https
        (?:[\w]+\.)*                # Optional subdomains
        (?:                         # Group host alternatives.
            youtu\.be/              # Either youtu.be,
            | youtube\.com          # or youtube.com
            | youtube-nocookie\.com # or youtube-nocookie.com
        )                           # End Host Group
        (?:\S*[^\w\-\s])?           # Extra stuff up to VIDEO_ID
        ([\w\-]{11})                # $1: VIDEO_ID (11 letters, digits, underscores or hyphens)
        [^\s]*                      # Not a space
    )                               # End group
    "?                              # Match end quote if part of src
    (?:[^>]*>)?                     # Match any extra stuff up to the closing angle bracket
    (?:                             # Group to match last embed code
        </iframe>                   # Match the end of the iframe
        |</embed></object>          # or Match the end of the older embed
    )?                              # End Group of last bit of embed code
~ix';
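To see how the ID capture behaves on its own, here is a minimal sketch that keeps only the URL part of the Regex and runs it against the plain links from the list above. The names $coreRegex, $samples and $ids are mine, and the trimmed pattern is an illustration, not a replacement for the full Regex.

```php
<?php
// Trimmed-down core of the Regex: only the URL part, without the
// embed-code groups. $coreRegex, $samples and $ids are illustrative names.
$coreRegex = '~
    https?://                   # Either http or https
    (?:[\w]+\.)*                # Optional subdomains
    (?:youtu\.be/|youtube\.com|youtube-nocookie\.com)
    (?:\S*[^\w\-\s])?           # Extra stuff up to VIDEO_ID
    ([\w\-]{11})                # $1: VIDEO_ID
~ix';

$samples = [
    'http://youtu.be/OxWMsxa5uVk',
    'http://www.youtube.com/watch?v=OxWMsxa5uVk&t=28s',
    'https://www.youtube.com/watch?v=OxWMsxa5uVk&feature=g-logo',
    'http://www.youtube.com/embed/OxWMsxa5uVk',
    'http://www.youtube.com/v/OxWMsxa5uVk?version=3&hl=en_US&rel=0',
];

// Collect the captured group for each sample (null when nothing matches).
$ids = [];
foreach ($samples as $url) {
    $ids[] = preg_match($coreRegex, $url, $m) ? $m[1] : null;
}

print_r($ids); // every entry is OxWMsxa5uVk
```

Because \S* is greedy, the engine backtracks from the end of the URL, so the capture lands on the last run of 11 ID characters that follows a separator such as "/" or "=".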
Usage Example:
This example function takes an input string and uses the above Regex to parse off the Video_ID ($1) and add it to $iframestr to create a standard embed code.
function ParsePostYouTube($string){
    $regexstr = <<REGEX FROM ABOVE>>; // placeholder: paste the Regex string from above here
    $iframestr = '<p><iframe width="500" height="284" src="http://www.youtube.com/embed/$1?wmode=transparent" frameborder="0" allowfullscreen></iframe></p>';
    return preg_replace($regexstr, $iframestr, $string);
}
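For anyone who wants something to paste and run, the following is a sketch of the same function with the Regex filled in (comments shortened) and two sample calls. The sample strings and the $fromLink and $fromEmbed variables are my own additions for illustration.

```php
<?php
// Runnable sketch: the usage function with the Regex from above pasted in.
function ParsePostYouTube($string) {
    $regexstr = '~
        (?:                                 # Optional start of an embed code
            (?:<iframe [^>]*src=")?         # iframe up to the first quote of src
            |(?:                            # or the older object and embed markup
                (?:<object .*>)?
                (?:<param .*</param>)*
                (?:<embed [^>]*src=")?
            )?
        )?
        (?:                                 # The YouTube URL itself
            https?:\/\/
            (?:[\w]+\.)*
            (?:youtu\.be/|youtube\.com|youtube-nocookie\.com)
            (?:\S*[^\w\-\s])?
            ([\w\-]{11})                    # $1: VIDEO_ID
            [^\s]*
        )
        "?                                  # Closing quote if part of src
        (?:[^>]*>)?                         # Rest of the tag
        (?:</iframe>|</embed></object>)?    # Optional end of the embed code
    ~ix';
    $iframestr = '<p><iframe width="500" height="284" src="http://www.youtube.com/embed/$1?wmode=transparent" frameborder="0" allowfullscreen></iframe></p>';
    return preg_replace($regexstr, $iframestr, $string);
}

// A bare link inside some text: only the link is replaced.
$fromLink = ParsePostYouTube('Watch this: http://youtu.be/OxWMsxa5uVk');

// A complete iframe embed: the whole tag is replaced by the standard embed.
$fromEmbed = ParsePostYouTube('<iframe width="560" height="315" src="http://www.youtube.com/embed/OxWMsxa5uVk" frameborder="0" allowfullscreen></iframe>');

echo $fromLink, "\n", $fromEmbed, "\n";
```

Both calls should end up with the same 500x284 iframe, which is the point of normalising every supported form to one embed code.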
I got here looking for a regexp to check if youtube embed code is legal
Great post bro
This post is very good in its quality and accuracy.
Thanks a lot!
Oops!
When the URL comes from a “list”, the script parses the list ID instead of the video ID.
e.g.: http://www.youtube.com/watch?v=15PLNuVDZU0&list=UUBNDrWLYE9U3Jqfv6Z_R9kA&index=5
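This looks like a consequence of the greedy \S* in the "Extra stuff up to VIDEO_ID" line: it backtracks from the end of the URL, so the first 11 characters of the list value win over the v= value. One possible workaround, sketched here under assumptions, is to read the v= query parameter first and only fall back to the last path segment. The helper name extractYouTubeId is hypothetical and not part of the original post, and the code assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not from the original post): extract the VIDEO_ID
// from a single YouTube URL without relying on the big Regex.
function extractYouTubeId(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return null;
    }

    // Only accept the hosts the Regex above also accepts.
    if (!preg_match('~(?:^|\.)(?:youtu\.be|youtube\.com|youtube-nocookie\.com)\z~i', $parts['host'])) {
        return null;
    }

    // Prefer an explicit v= parameter (watch URLs, with or without list=).
    if (!empty($parts['query'])) {
        parse_str($parts['query'], $query);
        if (isset($query['v']) && is_string($query['v'])
            && preg_match('~\A[\w\-]{11}\z~', $query['v'])) {
            return $query['v'];
        }
    }

    // Otherwise use the last path segment (youtu.be/ID, /embed/ID, /v/ID).
    $segment = basename($parts['path'] ?? '');
    return preg_match('~\A[\w\-]{11}\z~', $segment) ? $segment : null;
}

$id = extractYouTubeId('http://www.youtube.com/watch?v=15PLNuVDZU0&list=UUBNDrWLYE9U3Jqfv6Z_R9kA&index=5');
echo $id, "\n"; // 15PLNuVDZU0
```

This only handles a bare URL, so pulling links out of a longer post or out of embed markup would still need the Regex (or another extraction step) in front of it.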