Using stripslashes within preg_replace with the “e” modifier
First published on June 11, 2011
Recently, I had to do some formatting on existing HTML code, to transform nested paragraphs within blockquotes into individual blockquotes, for XML processing. For example, turning this:
... lots of different HTML before this
<blockquote><p>This is a quote with a <a href="http://www.theblog.ca" title="Peter's Useful Crap">link to a fun site</a></p><p>You should come back often</p></blockquote>
... lots of different HTML after this
Into this:
... lots of different HTML before this
<blockquote>This is a quote with a <a href="http://www.theblog.ca" title="Peter's Useful Crap">link to a fun site</a></blockquote><blockquote>You should come back often</blockquote>
... lots of different HTML after this
This is not a particularly exciting or common task, but it comes with a simple tip. My solution was to use the “e” modifier with preg_replace, which evaluates the replacement string as PHP code, so that you can run code on the matched text only. In my case, that code was a simple str_replace:
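<?php
function stripParagraphsFromBlockquotes( $text )
{
    // The "e" modifier makes preg_replace evaluate $replace as PHP code for each match
    $find = '/<blockquote><p>(.*?)<\/p><\/blockquote>/ise';
    // Split nested paragraphs into separate blockquotes; stripslashes undoes the escaping added to '\\1'
    $replace = "'<blockquote>' . stripslashes( str_replace( '</p><p>', '</blockquote>\n<blockquote>', '\\1' ) ) . \"</blockquote>\n\"";
    $text = preg_replace( $find, $replace, $text );
    return $text;
}
?>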
At first, however, I did not have the outer stripslashes call, and was ending up with this output, which breaks XML parsers:
<blockquote>This is a quote with a <a href=\"http://www.theblog.ca\" title=\"Peter's Useful Crap\">link to a fun site</a></blockquote><blockquote>You should come back often</blockquote>
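The tip is to use stripslashes on your inner “replace” code, since the “e” modifier automatically escapes single quotes, double quotes, backslashes and NULL characters with backslashes in the preg_replace match '\\1'. Because '\\1' sits inside a single-quoted string in the evaluated code, PHP itself undoes the escaping of single quotes and backslashes, but the backslashes in front of double quotes survive. That is why only the double quotes show up escaped in the output above.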
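The “e” modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the function above will not run on current PHP versions. As a rough sketch of how the same transformation could look with preg_replace_callback (this is not the approach from the original post, and the function name stripParagraphsFromBlockquotesCallback is made up for illustration), the callback receives the raw match, so no slashes are added and no stripslashes call is needed:

```php
<?php
// Sketch only: an alternative to the "e" modifier, assuming PHP 5.3+ (closures).
// The function name is hypothetical and not part of the original post.
function stripParagraphsFromBlockquotesCallback( $text )
{
    // Same pattern as before, minus the "e" modifier
    $find = '/<blockquote><p>(.*?)<\/p><\/blockquote>/is';

    return preg_replace_callback(
        $find,
        function ( $matches ) {
            // $matches[1] is the captured text exactly as it appears in the input,
            // so there is nothing to unescape
            $inner = str_replace( '</p><p>', "</blockquote>\n<blockquote>", $matches[1] );
            return '<blockquote>' . $inner . "</blockquote>\n";
        },
        $text
    );
}
```

Like the original, this version puts a newline after each closing blockquote tag. It also sidesteps a side effect of the stripslashes approach, which would remove any backslashes that legitimately appear in the matched HTML.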