
How can I explode the following string:

Lorem ipsum "dolor sit amet" consectetur "adipiscing elit" dolor

into

array("Lorem", "ipsum", "dolor sit amet", "consectetur", "adipiscing elit", "dolor")

So that the text in quotation is treated as a single word.

Here's what I have for now:

$mytext = "Lorem ipsum %22dolor sit amet%22 consectetur %22adipiscing elit%22 dolor";
$noquotes = str_replace("%22", "", $mytext);
$newarray = explode(" ", $noquotes);

but my code puts every word into its own array element. How do I make the words inside quotation marks count as one word?

 Answers

3

You could use preg_match_all(...):

$text = 'Lorem ipsum "dolor sit amet" consectetur "adipiscing \"elit" dolor';
preg_match_all('/"(?:\\\\.|[^\\\\"])*"|\S+/', $text, $matches);
print_r($matches);

which will produce:

Array
(
    [0] => Array
        (
            [0] => Lorem
            [1] => ipsum
            [2] => "dolor sit amet"
            [3] => consectetur
            [4] => "adipiscing \"elit"
            [5] => dolor
        )

)

And as you can see, it also accounts for escaped quotes inside quoted strings.

EDIT

A short explanation:

"           # match the character '"'
(?:         # start non-capture group 1
  \\        #   match the character '\'
  .         #   match any character except line breaks
  |         #   OR
  [^\\"]    #   match any character except '\' and '"'
)*          # end non-capture group 1 and repeat it zero or more times
"           # match the character '"'
|           # OR
\S+         # match a non-whitespace character: [^\s] and repeat it one or more times

And in case of matching %22 instead of double quotes, you'd do:

preg_match_all('/%22(?:\\\\.|(?!%22).)*%22|\S+/', $text, $matches);
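
Note that the matches still carry their surrounding delimiters, whereas the array asked for in the question does not. A minimal sketch of one way to unwrap them, assuming the %22-delimited input from the question; the helper name split_quoted is hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper: split on whitespace, keeping %22-delimited phrases whole.
function split_quoted(string $text): array
{
    // Either a %22...%22 phrase (backslash escapes allowed) or a run of non-whitespace.
    preg_match_all('/%22(?:\\\\.|(?!%22).)*%22|\S+/', $text, $matches);

    // Strip the leading and trailing %22 from delimited phrases; other tokens pass through unchanged.
    return array_map(
        fn(string $m): string => preg_replace('/^%22(.*)%22$/s', '$1', $m),
        $matches[0]
    );
}

$mytext = "Lorem ipsum %22dolor sit amet%22 consectetur %22adipiscing elit%22 dolor";
print_r(split_quoted($mytext));
```

This sketch only removes the outer delimiters; backslash-escaped sequences inside a phrase are left exactly as they were matched.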
Tuesday, October 4, 2022
1

How can I avoid that? I want it to be check-out-the-1-place, so that there is only one hyphen between each word. Here is my code:

Whilst Mohammad's answer is nearly there, here is a more complete PCRE regex method, with an explanation of how it works so you can adapt it as needed:

$str = trim(strtolower($pathname));
$newStr = preg_replace('/[\s.,-]+/', '-', $str);

How this works:

  • Match a single character present in the list below [\s.,-]+
    • + Quantifier Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
    • \s matches any whitespace character (equal to [\r\n\t\f\v ])
    • .,- matches a single character in the list .,- (case sensitive)
    • The dash - must come at the end (or the start) of the [] set, or be escaped, so that it is read literally rather than as a range.

Results:

This: check out the 1. place

Becomes:

check-out-the-1-place

And

This: check out the - 1. place

Becomes

check-out-the-1-place


Further:

Going further: assuming you are using this for a URL slug (a what?!), I would strip out all non-alphanumeric characters from the string and replace each run of them with a single -, as in typical website slugs.

 $newStr = preg_replace('/[^a-z0-9]+/i', '-', $str);

How this works:

  • Match a single character NOT (^) present in the list below [^a-z0-9]+
    • + Quantifier Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
    • a-z a single character in the range between a (index 97) and z (index 122) (case sensitive)
    • 0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
    • The i at the end makes the match case-insensitive.

Example:

check out - the no.! 1. Place

Becomes:

check-out-the-no-1-Place
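
The steps in this answer can be combined into a single function. A minimal sketch, assuming ASCII-only input (non-ASCII letters would be replaced by hyphens too); the name slugify is hypothetical, and the final trim of stray hyphens is an addition that the answer itself does not include:

```php
<?php
// Hypothetical helper that combines the steps above into one slug function.
function slugify(string $text): string
{
    // Trim surrounding whitespace and lower-case so the result is uniform.
    $slug = strtolower(trim($text));
    // Collapse every run of non-alphanumeric characters into a single hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    // Drop hyphens left at either end by leading or trailing punctuation.
    return trim($slug, '-');
}

echo slugify('check out the - 1. place'), "\n";      // check-out-the-1-place
echo slugify('check out - the no.! 1. Place'), "\n"; // check-out-the-no-1-place
```

Because the string is lower-cased first, the i modifier is not needed here.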

Tuesday, August 30, 2022
4

Use implode, explode and array_slice.
I use array_slice because it makes the function more dynamic.

Set $items to the number of items you want.
If you set a negative value, the slice stops that many items before the end.

$delim = ",";
$items = 2;
$text = "oh my god, thank you the lot of downvotes, geniuses *.*";
// Split on the delimiter, keep the first $items pieces, and join them again.
$whatiwant = implode($delim, array_slice(explode($delim, $text), 0, $items));
echo $whatiwant;

https://3v4l.org/KNSC4

You could also have a start variable to make the start position dynamic.

https://3v4l.org/XD0NV
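
For readers who cannot open the links, the start-position idea might look roughly like this. It is a sketch only: the function name slice_delimited and the sample string are hypothetical, and it is not a copy of the linked code.

```php
<?php
// Hypothetical helper: keep $items delimited pieces of $text, beginning at piece $start.
function slice_delimited(string $text, string $delim, int $start, int $items): string
{
    $parts = explode($delim, $text);
    // array_slice takes the offset first, then the length; a negative offset counts from the end.
    return implode($delim, array_slice($parts, $start, $items));
}

$text = "one,two,three,four,five";
echo slice_delimited($text, ",", 0, 2), "\n";  // one,two
echo slice_delimited($text, ",", 1, 3), "\n";  // two,three,four
echo slice_delimited($text, ",", -2, 2), "\n"; // four,five
```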

Monday, August 29, 2022
1

Here's one simple way to do it. Split/explode your string using quotation marks. Element 0 and every other even-numbered element of the resulting array is unquoted text; the odd-numbered elements are the text inside quotes. This assumes the quotes are balanced and never escaped. Example:

Test "testing 123" Test etc.
^0    ^1          ^2

Then, just replace the magic words (toys) with the replacement (cards) in only the even-numbered array elements.

Sample code:

function replace_not_quoted($needle, $replace, $haystack) {
    $arydata = explode('"', $haystack);

    $count = count($arydata);
    for($s = 0; $s < $count; $s+=2) {
        $arydata[$s] = preg_replace('~'.preg_quote($needle, '~').'~', $replace, $arydata[$s]);
    }
    return implode('"', $arydata);
}

$data = 'tony is playing with toys.
tony is playing with toys... "those toys that are not his" but they are "nice toys," those toys';

echo replace_not_quoted('toys', 'cards', $data);

So, here, the sample data is:

tony is playing with toys.
tony is playing with toys... "those toys that are not his" but they are "nice toys," those toys

The algorithm works as expected and produces:

tony is playing with cards.
tony is playing with cards... "those toys that are not his" but they are "nice toys," those cards
Monday, August 22, 2022
 
3

You can use a regular dict:

targetword = "good"
wordmap = {
    "best": targetword,
    "positive": targetword,
    "awesome": targetword,
    "fantastic": targetword
}
Saturday, August 20, 2022
 
dewtell
 