Regex: words containing alphabetic characters surrounded by non-alphabetic characters
I'm a regex noob and need help solving this:
Find and remove occurrences of groups of 1 or 2 alphabetic characters that are surrounded by non-alphabetic characters. I may encounter Latin characters, which must be treated as alphabetic characters. I'm using PHP, so the PCRE regex flavor.
For example:
Remove:
a
aa 33 a3 3a
aa3 a3a 3aa 33a a33 3a3
aa3a a3aa 33a3 3a33 aa33 33aa a3a3 3a3a 3aa3 a33a
aa3aa 3aa3a a3a3a aa3a3 33a33 a33a3 3a3a3 33a3a a3a33
aa3aa3 a3a3a3 3a3a3a 33a33a
and so on...
In cases like "aa3aaa", the regex needs to match the aa3 part.
This is what I've got so far:
(\b\d*?[a-z]{1,2}\d*?\b)|(\b(\d+?[a-z]{1,2}\d+?)+?\b)|(\b([a-z]{1,2}\d+?[a-z]{1,2})+?\b)|(\b(\d+?[a-z]{1,2})+?\b)|(\b([a-z]{1,2}\d+?)+?\b)
img: https://www.debuggex.com/i/gkz0uhvvhoysmn81.png
I cannot match these words:
3l3l3
l3l3l
I also cannot partially match this word:
aa3aaa
Any help improving the regex is appreciated! Thank you very much!
You didn't say which regex flavor you use, so here's a way using PCRE:
(?<!\p{L})\p{L}{1,2}(?!\p{L})
This translates your requirement in this way:
- Groups of 1 or 2 alphabetic characters:
\p{L}{1,2}
- Not preceded by an alphabetic character:
(?<!\p{L})
- Not followed by an alphabetic character:
(?!\p{L})
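As a minimal sketch of how this pattern could be applied in PHP: the function name removeShortLetterGroups is hypothetical, and the u modifier assumes the input is valid UTF-8, so that \p{L} also covers non-ASCII Latin letters.

```php
<?php
// Hypothetical helper: strips every run of 1 or 2 letters that has no
// letter directly before or after it.
function removeShortLetterGroups(string $text): string
{
    // "u" makes PCRE treat the pattern and subject as UTF-8, so \p{L}
    // matches letters such as "é" as single characters.
    $result = preg_replace('/(?<!\p{L})\p{L}{1,2}(?!\p{L})/u', '', $text);

    // preg_replace returns null on failure (e.g. invalid UTF-8 input);
    // in that case fall back to the unchanged text.
    return $result ?? $text;
}

echo removeShortLetterGroups('3l3l3'), "\n";   // 333
echo removeShortLetterGroups('aa3aaa'), "\n";  // 3aaa
```

Note that in "aa3aaa" only the leading "aa" is removed; the trailing "aaa" is a run of three letters, so no part of it qualifies.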
You can replace \p{L} with
[a-zA-Z]
if your flavor doesn't support Unicode properties.
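For comparison, here is a rough sketch of the ASCII-only variant in PHP (the function name removeShortAsciiLetterGroups is made up for illustration). With this character class, accented letters are not treated as alphabetic, so it only fits input limited to a-z and A-Z.

```php
<?php
// Hypothetical helper: same idea as the \p{L} pattern, but limited to
// ASCII letters, so no Unicode property support is required.
function removeShortAsciiLetterGroups(string $text): string
{
    $pattern = '/(?<![a-zA-Z])[a-zA-Z]{1,2}(?![a-zA-Z])/';

    // Fall back to the unchanged text if preg_replace reports an error.
    return preg_replace($pattern, '', $text) ?? $text;
}

echo removeShortAsciiLetterGroups('a3 abc 3aa'), "\n"; // 3 abc 3
```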