I'm looking to replace all instances of spaces in urls with %20. How would I do that with regex?
Thank you!
There's no need to use a regex for this. PHP has a built-in function for exactly this. Use parse_url():
$domain = parse_url($url, PHP_URL_HOST);
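As a rough sketch of how that call behaves, the snippet below uses a made-up example.com URL (an assumption, not from the question). It also shows that the host is only recognised when the string includes a scheme.

```php
<?php
// Hypothetical sample URL, used only for illustration.
$url = 'http://www.example.com/some/path?foo=bar';

// With a component constant, parse_url() returns just that component.
$domain = parse_url($url, PHP_URL_HOST);
echo $domain, "\n";

// Without a scheme the whole string is treated as a path,
// so asking for the host gives null.
$noScheme = parse_url('www.example.com/some/path', PHP_URL_HOST);
var_dump($noScheme);
```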
Here is yet another way to do this and preserve your spaces:
$my_string = "http://apinmo.com/1/2/3.jpg 4444/8888/7777 http://apinmo.com/4/5/8-1.jpg";
$string_array = explode(' ', $my_string);
print_r($string_array); // for testing
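$new_array = array();
foreach ($string_array as $original) {
    // Only touch the pieces that start with "http".
    $pos = strpos($original, 'http');
    if (0 === $pos) {
        // Remove any slash that sits between two digits.
        $new = preg_replace('/(?<=\d)\/(?=\d)/', '', $original);
        $new_array[] = $new;
    } else {
        $new_array[] = $original;
    }
}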
$new_string = implode(' ', $new_array);
echo $new_string;
Returns (note the preserved spaces):
http://apinmo.com/123.jpg 4444/8888/7777 http://apinmo.com/458-1.jpg
EDIT - Pure regex method:
$new_string = preg_replace('/(?<=\/\d)(\/)/', '', $my_string);
echo $new_string;
Returns: http://apinmo.com/123.jpg 4444/8888/7777 http://apinmo.com/458-1.jpg
CAVEATS:
a. ) works even if there are no spaces in the string
2. ) does not work if any number between /
is more than one digit in length.
iii. ) if the second group of digits is like 4444/5/8888
the second slash would get removed here too.
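To make the last two caveats concrete, here is a small sketch. It assumes the pure regex pattern is /(?<=\/\d)(\/)/ with the backslashes written out, and the two input strings are made up for illustration.

```php
<?php
// Assumed form of the pure regex pattern, with backslashes written out.
$pattern = '/(?<=\/\d)(\/)/';

// Caveat 2: "12" has two digits, so the slash after it is not directly
// preceded by "/<digit>" and is left in place.
$multi = preg_replace($pattern, '', 'http://apinmo.com/12/3.jpg');
echo $multi, "\n";

// Caveat 3: in a non-URL group, "/5/" also matches, so the second slash is lost.
$group = preg_replace($pattern, '', '4444/5/8888');
echo $group, "\n";
```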
Here is how the regex breaks down:
A positive lookbehind, (?<=\/\d), matches the position right after a / followed by a digit. With it I can assert what I am looking for: I only want to remove a forward slash that comes after a forward slash followed by a digit. I can therefore capture that slash with (\/) immediately after the lookbehind. There is no need to include http:// to start or .jpg to close out.
You can use this regex:
#(\s|^)((?:https?://)?\w+(?:\.\w+)+(?<=\.(net|org|edu|com))(?:/[^\s]*|))(?=\s|\b)#is
Code:
$arr = array(
'http://www.domain.com/?foo=bar',
'http://www.that"sallfolks.com',
'This is really cool site: https://www.domain.net/ isn\'t it?',
'http://subdomain.domain.org',
'www.domain.com/folder',
'Hello! You can visit vertigofx.com/mysite/rocks for some awesome pictures, or just go to vertigofx.com by itself',
'subdomain.domain.net',
'subdomain.domain.edu/folder/subfolder',
'Hello! Check out my site at domain.net!',
'welcome.to.computers',
'Hello.Come visit oursite.com!',
'foo.bar',
'domain.com/folder',
);
foreach ($arr as $url) {
    // create_function() was removed in PHP 8.0, so an anonymous function is used here.
    $link = preg_replace_callback('#(\s|^)((?:https?://)?\w+(?:\.\w+)+(?<=\.(net|org|edu|com))(?:/[^\s]*|))(?=\s|\b)#is',
        function ($m) {
            // Prepend http:// to the href when the matched URL has no scheme.
            $href = preg_match('#^https?://#', $m[2]) ? $m[2] : 'http://' . $m[2];
            return $m[1] . '<a href="' . $href . '">' . $m[2] . '</a>';
        },
        $url);
    echo $link . "\n";
}
OUTPUT:
<a href="http://www.domain.com/?foo=bar">http://www.domain.com/?foo=bar</a>
http://www.that"sallfolks.com
This is really cool site: <a href="https://www.domain.net">https://www.domain.net</a>/ isn't it?
<a href="http://subdomain.domain.org">http://subdomain.domain.org</a>
<a href="http://www.domain.com/folder">www.domain.com/folder</a>
Hello! You can visit <a href="http://vertigofx.com/mysite/rocks">vertigofx.com/mysite/rocks</a> for some awesome pictures, or just go to <a href="http://vertigofx.com">vertigofx.com</a> by itself
<a href="http://subdomain.domain.net">subdomain.domain.net</a>
<a href="http://subdomain.domain.edu/folder/subfolder">subdomain.domain.edu/folder/subfolder</a>
Hello! Check out my site at <a href="http://domain.net">domain.net</a>!
welcome.to.computers
Hello.Come visit <a href="http://oursite.com">oursite.com</a>!
foo.bar
<a href="http://domain.com/folder">domain.com/folder</a>
PS: This regex only supports the http and https schemes in a URL. If you want to support ftp as well, for example, you need to modify the regex a little.
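One possible way to make that modification, as a sketch rather than the answer's own code: widen the optional scheme group to (?:https?|ftp) in both the main pattern and the scheme check. The ftp:// sample string is hypothetical.

```php
<?php
// Assumed variant of the pattern: the scheme group also accepts ftp.
$pattern = '#(\s|^)((?:(?:https?|ftp)://)?\w+(?:\.\w+)+(?<=\.(net|org|edu|com))(?:/[^\s]*|))(?=\s|\b)#is';

$link = preg_replace_callback(
    $pattern,
    function ($m) {
        // Keep the URL as the href if it already has a scheme, else assume http.
        $href = preg_match('#^(?:https?|ftp)://#', $m[2]) ? $m[2] : 'http://' . $m[2];
        return $m[1] . '<a href="' . $href . '">' . $m[2] . '</a>';
    },
    'ftp://files.domain.org/pub' // hypothetical input
);
echo $link, "\n";
```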
The newline could still be registering if you've got the "\r\n" version of a newline - you're removing "\n" and leaving "\r" behind.
$str = preg_replace('/\s+/',',',str_replace(array("\r\n","\r","\n"),' ',trim($str)));
Cleaner (more legible) version:
$str = preg_replace('#\s+#',',',trim($str));
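A quick sketch of both points, using made-up sample strings: replacing only "\n" leaves the "\r" behind, while \s+ covers every newline style in one pass.

```php
<?php
// Replacing only "\n" leaves the carriage return in the string.
$leftover = str_replace("\n", ',', "a\r\nb");
var_dump($leftover === "a\r,b");

// \s+ matches "\r\n", "\r", "\n" and runs of spaces alike.
$str = "alpha\r\nbeta\rgamma\ndelta  epsilon\n";
$csv = preg_replace('#\s+#', ',', trim($str));
echo $csv, "\n";
```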
No need for a regex here, if you just want to replace a piece of string by another: using str_replace() should be more than enough. But if you want a bit more than that, and you probably do if you are working with URLs, you should take a look at the urlencode() function.
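For comparison, here is a minimal sketch with a made-up URL and file name. Note that urlencode() turns spaces into +, so rawurlencode() is the one that produces %20. Both are meant for a single component such as a file name or query value, not a whole URL.

```php
<?php
// Hypothetical URL containing spaces.
$url = 'http://example.com/my file name.jpg';

// Plain string replacement, no regex needed.
$simple = str_replace(' ', '%20', $url);
echo $simple, "\n";

// Encoding a single component: urlencode() uses "+", rawurlencode() uses "%20".
$plus = urlencode('my file name');
$raw  = rawurlencode('my file name');
echo $plus, "\n", $raw, "\n";
```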