
I have some strings which can be in any of the following formats:

sometext moretext 01 text
text sometext moretext 002
text text 1 (somemoretext)
etc

I want to split each string into two parts: the text before the number, and the number itself.

For example: text text 1 (somemoretext)
When split will output:
text = text text
number = 1

Anything after the number can be discarded

I have read up on regular expressions and on using preg_match or preg_split, but I am lost when it comes to writing the regular expression itself.

 Answers

1
preg_match('/[^\d]+/', $string, $textMatch);
preg_match('/\d+/', $string, $numMatch);

$text = $textMatch[0];
$num = $numMatch[0];

Alternatively, you can use preg_match_all with capture groups to do it all in one shot:

preg_match_all('/^([^\d]+)(\d+)/', $string, $match);

$text = $match[1][0];
$num = $match[2][0];
Tuesday, October 4, 2022
2

By using [^0-9]+ you are actually matching the numbers and splitting on them, which leaves you with an empty array element instead of the expected result. You can use a workaround to do this.

print_r(preg_split('/\d+\K/', '12345hello'));
# Array ([0] => 12345 [1] => hello)

The \K escape sequence tells the engine to drop whatever it has matched so far from the match to be returned, so the split happens right after the digits without consuming them.

If you want to consistently do this with larger text, you need multiple lookarounds.

print_r(preg_split('/(?<=\D)(?=\d)|\d+\K/', '12345hello6789foo123bar'));
# Array ([0] => 12345 [1] => hello [2] => 6789 [3] => foo [4] => 123 [5] => bar)
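For comparison, here is a minimal sketch of the same digit/non-digit split in Python, which appears elsewhere in this thread. Python's re module does not support \K, so this swaps in a second lookaround pair instead; it is a different pattern from the one above, not a translation of it. It assumes Python 3.7 or later, where re.split() splits on empty matches. The variable names are only illustrative.

```python
import re

sample = "12345hello6789foo123bar"

# Split at every boundary between a non-digit and a digit, in either direction.
# Both alternatives are zero-width, so no characters are consumed by the split.
boundary = re.compile(r"(?<=\D)(?=\d)|(?<=\d)(?=\D)")

parts = boundary.split(sample)
print(parts)
# ['12345', 'hello', '6789', 'foo', '123', 'bar']
```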
Friday, September 30, 2022
 
chendo
 
1

You can tell preg_split() to split at any point in the string which is followed by three digits by using a lookahead assertion.

$str = "101WE3P-1An Electrically-Small104TU5A-3,Signal-Interference Duplexers Gomez-GarciaRobertoTU5A-3-01";
$result = preg_split('/(?=\d{3})/', $str, -1, PREG_SPLIT_NO_EMPTY);

var_export($result);

Gives the following array:

array (
  0 => '101WE3P-1An Electrically-Small',
  1 => '104TU5A-3,Signal-Interference Duplexers Gomez-GarciaRobertoTU5A-3-01',
)

The PREG_SPLIT_NO_EMPTY flag is used because the very start of the string is also a point where there are three digits, so an empty split happens here. We could alter the regex to not split at the very start of the string but that would make it a little more difficult to understand at-a-glance, whereas the flag is very clear.
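The alternative mentioned above, a regex that refuses to split at the very start of the string, could look something like the sketch below. It is written in Python (3.7 or later, where re.split() accepts empty matches) to match the re.match answer elsewhere in this thread. The (?!^) prefix is my own choice for excluding the start position; it uses only syntax that PCRE also accepts, so it should carry over to preg_split(), but that is an assumption and not tested here.

```python
import re

s = ("101WE3P-1An Electrically-Small104TU5A-3,Signal-Interference "
     "Duplexers Gomez-GarciaRobertoTU5A-3-01")

# (?!^)      : do not match at the very start of the string
# (?=\d{3})  : match only where three digits follow
parts = re.split(r"(?!^)(?=\d{3})", s)

for part in parts:
    print(part)
```

With the start position excluded, no empty leading element is produced, so no filtering step is needed afterwards.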

Thursday, September 15, 2022
 
1

I would approach this by using re.match in the following way:

import re
match = re.match(r"([a-z]+)([0-9]+)", 'foofo21', re.I)
if match:
    items = match.groups()
    print(items)
# ('foofo', '21')
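The same re.match idea can be adapted to the strings in the question, where the text may contain spaces and anything after the number is discarded. The following is a minimal sketch under those assumptions; the function name split_text_and_number and the exact pattern are illustrative choices, not something taken from the answers above.

```python
import re

# ^(.*?)  : lazily capture the text before the number
# \s*     : skip whitespace between the text and the number
# (\d+)   : capture the first run of digits; the rest of the string is ignored
PATTERN = re.compile(r"^(.*?)\s*(\d+)")

def split_text_and_number(s):
    """Return (text, number) as strings, or None if s contains no digits."""
    match = PATTERN.match(s)
    if match is None:
        return None
    return match.group(1), match.group(2)

samples = [
    "sometext moretext 01 text",
    "text sometext moretext 002",
    "text text 1 (somemoretext)",
]
for sample in samples:
    print(split_text_and_number(sample))
# ('sometext moretext', '01')
# ('text sometext moretext', '002')
# ('text text', '1')
```

The number is returned as a string so that leading zeros such as 01 and 002 are preserved; call int() on it if a numeric value is needed.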
Saturday, October 22, 2022
 
ffl
 
2

Something like this should do the trick, to match what is between ( and ) :

$str = "blah blah blah (here is the bit I'd like to extract)";
if (preg_match('/\(([^)]+)\)/', $str, $matches)) {
    var_dump($matches[1]);
}

And you'd get :

string 'here is the bit I'd like to extract' (length=35)


Basically, the pattern I used searches for :

  • An opening ( ; but as ( has a special meaning, it has to be escaped : \(
  • One or more characters that are not a closing parenthesis : [^)]+
    • This being captured, so we can use it later : ([^)]+)
    • And this first (and only, here) captured thing will be available as $matches[1]
  • A closing ) ; here, too, it's a special character that has to be escaped : \)
Wednesday, November 16, 2022
 