
To match a literal backslash, many people and the PHP manual say: Always triple escape it, like this \\\\

Note:

Single and double quoted PHP strings have special meaning of backslash. Thus if \ has to be matched with a regular expression \\, then "\\\\" or '\\\\' must be used in PHP code.

Here is an example string: \test

$test = "\\test"; // outputs \test

// WON'T WORK: pattern in double-quotes with double-escaped backslash
#echo preg_replace("~\\\t~", '', $test); #output -> \test

// WORKS: pattern in double-quotes with triple-escaped backslash
#echo preg_replace("~\\\\t~", '', $test); #output -> est

// WORKS: pattern in single-quotes with double-escaped backslash
#echo preg_replace('~\\\t~', '', $test); #output -> est

// WORKS: pattern in double-quotes with double-escaped backslash inside a character class
#echo preg_replace("~[\\\]t~", '', $test); #output -> est

// WORKS: pattern in single-quotes with double-escaped backslash inside a character class
#echo preg_replace('~[\\\]t~', '', $test); #output -> est

Conclusion:

  • If the pattern is single-quoted, a backslash has to be double-escaped (\\\) to match a literal \
  • If the pattern is double-quoted, it depends on whether the backslash is inside a character class, where it must be at least double-escaped (\\\); outside a character class it has to be triple-escaped (\\\\)

Who can show me a case where a double-escaped backslash in a single-quoted pattern, e.g. '~\\\~', would match anything different than a triple-escaped backslash in a double-quoted pattern, e.g. "~\\\\~", or would fail?

When/why/in what scenario would it be wrong to use a double-escaped \\\ in a single-quoted pattern, e.g. '~\\\~', for matching a literal backslash?

If there's no answer to this question, I would continue to always use a double-escaped backslash \\\ in a single-quoted PHP regex pattern to match a literal \ because there's possibly nothing wrong with it.

 Answers

2

A backslash character (\) is considered to be an escape character by both PHP's parser and the regular expression engine (PCRE). If you write a single backslash character, the PHP parser treats it as an escape character. If you write two backslashes, the PHP parser interprets them as one literal backslash. But when that single backslash reaches the regular expression engine, the engine again treats it as an escape character. To avoid this, you need to write three or four backslash characters, depending on how you quote the pattern.

To understand the difference between the two types of quoting patterns, consider the following two var_dump() statements:

var_dump('~\\\~');
var_dump("~\\\\~");

Output:

string(4) "~\\~"
string(4) "~\\~"

The escape sequence \~ has no special meaning in PHP when it's used in a single-quoted string. Three backslashes also work because the PHP parser doesn't know about the escape sequence \~. So \\ will become \ but \~ will remain as \~.

Which one should you use:

For clarity, I'd always use ~\\\\~ when I want to match a literal backslash. The other one works too, but I think ~\\\\~ is clearer.
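To make the comparison concrete, here is a small sketch (the variable names and test subjects are only illustrative assumptions) showing that the three- and four-backslash forms reach PCRE as the same pattern when the next character is ~, and two situations where the three-backslash form appears to behave differently: before a t in a double-quoted pattern, and before a single quote in a single-quoted pattern.

```php
<?php
// A subject that starts with one literal backslash: \tes
$subject = '\\tes';

// Both literals reach PCRE as the same four characters: ~\\~
$single = '~\\\~';   // three backslashes, single-quoted
$double = "~\\\\~";  // four backslashes, double-quoted
var_dump($single === $double);                  // bool(true)
var_dump(preg_replace($single, '/', $subject)); // string(4) "/tes"

// Before "t", the double-quoted three-backslash form turns \t into a TAB.
$singleT = '~\\\t~'; // PCRE sees \\t (backslash, then t)
$doubleT = "~\\\t~"; // PCRE sees \ followed by a TAB character
var_dump(preg_replace($singleT, '', $subject)); // string(2) "es"
var_dump(preg_replace($doubleT, '', $subject)); // string(4) "\tes"

// Before a single quote, the single-quoted three-backslash form changes meaning.
$quoteSubject = "\\'x";                                // the three characters \'x
var_dump(preg_replace('~\\\'~', '', $quoteSubject));   // removes only the quote
var_dump(preg_replace('~\\\\\'~', '', $quoteSubject)); // removes backslash and quote
```

Four backslashes give the same pattern in both quoting styles regardless of what follows, which is one argument for preferring them.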

Thursday, October 27, 2022
4

Assuming you want to remove both ( and ) from the $search string:

$search = preg_replace('/\(|\)/','',$search);

I think the fastest way to do this is using the strtr function, like this:

$search = strtr($search, array('(' => '', ')' => ''));
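As a quick check that the regex and non-regex approaches agree, here is a minimal sketch. The sample input is an assumption, and str_replace is added only as another common non-regex option; no timing claims are made here.

```php
<?php
$search = 'foo (bar) baz';

// Regex version: the parentheses must be escaped to be matched literally.
$viaRegex = preg_replace('/\(|\)/', '', $search);

// strtr with a replacement map, as suggested above.
$viaStrtr = strtr($search, array('(' => '', ')' => ''));

// str_replace with an array of needles is another non-regex option.
$viaStrReplace = str_replace(array('(', ')'), '', $search);

var_dump($viaRegex);                    // string(11) "foo bar baz"
var_dump($viaRegex === $viaStrtr);      // bool(true)
var_dump($viaRegex === $viaStrReplace); // bool(true)
```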
Tuesday, November 15, 2022
 
2

PHP strings can be specified not just in two ways, but in four ways.

  1. Single quoted strings will display things almost completely "as is." Variables and most escape sequences will not be interpreted. The exception is that to display a literal single quote, you can escape it with a backslash \', and to display a backslash, you can escape it with another backslash \\ (So yes, even single quoted strings are parsed).
  2. Double quote strings will display a host of escaped characters (including some regexes), and variables in the strings will be evaluated. An important point here is that you can use curly braces to isolate the name of the variable you want evaluated. For example let's say you have the variable $type and you want to echo "The $types are". That will look for the variable $types. To get around this use echo "The {$type}s are" You can put the left brace before or after the dollar sign. Take a look at string parsing to see how to use array variables and such.
  3. Heredoc string syntax works like double quoted strings. It starts with <<<. After this operator, an identifier is provided, then a newline. The string itself follows, and then the same identifier again to close the quotation. You don't need to escape quotes in this syntax.
  4. Nowdoc (since PHP 5.3.0) string syntax works essentially like single quoted strings. The difference is that not even single quotes or backslashes have to be escaped. A nowdoc is identified with the same <<< sequence used for heredocs, but the identifier which follows is enclosed in single quotes, e.g. <<<'EOT'. No parsing is done in nowdoc.
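The four syntaxes can be compared side by side. This is a minimal sketch; the variable names and sample text are illustrative assumptions.

```php
<?php
$type = 'apple';

// 1. Single-quoted: no interpolation, \n stays as two characters.
$single = 'The {$type}s cost \n';

// 2. Double-quoted: {$type} is interpolated, \n becomes a newline.
$double = "The {$type}s cost \n";

// 3. Heredoc: behaves like double quotes, quotes need no escaping.
$heredoc = <<<EOT
The {$type}s are "ripe"
EOT;

// 4. Nowdoc: behaves like single quotes, but nothing at all is parsed.
$nowdoc = <<<'EOT'
The {$type}s are 'ripe' \ no parsing
EOT;

var_dump($single);
var_dump($double);
var_dump($heredoc);
var_dump($nowdoc);
```

The closing identifiers are kept at the start of the line so the sketch also runs on PHP versions older than 7.3.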

Notes: Single quotes inside of single quotes and double quotes inside of double quotes must be escaped:

$string = 'He said "What\'s up?"';
$string = "He said \"What's up?\"";

Speed:
I would not put too much weight on single quotes being faster than double quotes. They probably are faster in certain situations. Here's an article explaining one manner in which single and double quotes are essentially equally fast since PHP 4.3 (Useless Optimizations toward the bottom, section C). Also, this benchmarks page has a single vs double quote comparison. Most of the comparisons are the same. There is one comparison where double quotes are slower than single quotes.

Sunday, December 25, 2022
 
dariaa
 
2

preg_replace('/[_]+/', '_', $your_string);
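A short usage sketch of the call above, with an assumed sample input: each run of consecutive underscores collapses to a single underscore. The character class is optional here, so '/_+/' should behave the same way.

```php
<?php
$your_string = 'a__b___c_d';

// Each run of one or more underscores becomes a single underscore.
$collapsed = preg_replace('/[_]+/', '_', $your_string);
var_dump($collapsed);                                             // string(7) "a_b_c_d"
var_dump($collapsed === preg_replace('/_+/', '_', $your_string)); // bool(true)
```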

Friday, December 9, 2022
 
kalpesh
 
4

A string literal is:

  1. An open single-quote, followed by:
  2. Any number of doubled-single-quotes and non-single-quotes, then
  3. A close single quote.

Thus, our regex is:

r"'(''|[^'])*'"
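The pattern above is written as a Python-style raw string. As a sketch only, here is the same expression tried from PHP; the ~ delimiters, the double-quoted PHP literal, and the sample input are assumptions made for this example.

```php
<?php
// Double-quoted PHP literal, so the single quotes need no escaping.
$pattern = "~'(''|[^'])*'~";

// Sample input containing two literals, one with a doubled single quote.
$input = "SELECT 'it''s', 'ok'";

preg_match_all($pattern, $input, $matches);
var_dump($matches[0]); // the two literals: 'it''s' and 'ok'
```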
Wednesday, September 14, 2022
 