Regex: words containing alphabetic characters surrounded by non-alphabetic characters


I'm a regex noob and need help solving this:

Find and remove occurrences of groups of 1 or 2 alphabetic characters surrounded by non-alphabetic characters. I may encounter Latin characters, which must be treated as alphabetic characters. I'm using PHP, so the PCRE regex flavor.

For example:

Remove:
a
aa 33 a3 3a
aa3 a3a 3aa 33a a33 3a3
aa3a a3aa 33a3 3a33 aa33 33aa a3a3 3a3a 3aa3 a33a
aa3aa 3aa3a a3a3a aa3a3 33a33 a33a3 3a3a3 33a3a a3a33
aa3aa3 a3a3a3 3a3a3a 33a33a

and so on...

In cases like "aa3aaa", the regex needs to match the aa3 part.

This is what I've got so far:

(\b\d*?[a-z]{1,2}\d*?\b)|(\b(\d+?[a-z]{1,2}\d+?)+?\b)|(\b([a-z]{1,2}\d+?[a-z]{1,2})+?\b)|(\b(\d+?[a-z]{1,2})+?\b)|(\b([a-z]{1,2}\d+?)+?\b) 

Regex visualization at Debuggex: https://www.debuggex.com/i/gkz0uhvvhoysmn81.png

I cannot match these words:
3l3l3
l3l3l

I also cannot partially match this word:
aa3aaa

Any help improving the regex is appreciated! Thanks very much!

You didn't say which regex flavor you use, so here's a way using PCRE:

(?<!\p{L})\p{L}{1,2}(?!\p{L})


This translates your requirement in the following way:

  • groups of 1 or 2 alphabetic characters: \p{L}{1,2}
  • not preceded by an alphabetic character: (?<!\p{L})
  • not followed by an alphabetic character: (?!\p{L})
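The breakdown above can be tried out in PHP with a short sketch like the one below. The function name `removeShortLetterGroups` and the sample strings are only illustrative assumptions. The `/u` modifier is added so that PCRE reads the subject as UTF-8 and `\p{L}` also covers accented Latin letters.

```php
<?php
// Hypothetical helper: removes groups of 1 or 2 letters that are not
// adjacent to any other letter, using the pattern described above.
function removeShortLetterGroups(string $text): string
{
    // /u: treat pattern and subject as UTF-8 so \p{L} matches any letter.
    $pattern = '/(?<!\p{L})\p{L}{1,2}(?!\p{L})/u';

    // preg_replace returns null on failure (for example invalid UTF-8);
    // in that case this sketch simply returns the input unchanged.
    $result = preg_replace($pattern, '', $text);
    return $result ?? $text;
}

// Sample inputs, partly taken from the question.
foreach (['a3a3a', '3l3l3', 'aa3aaa', 'é1 año'] as $sample) {
    echo $sample, ' => ', removeShortLetterGroups($sample), "\n";
}
```

Note that this pattern removes only the letters themselves and leaves the neighbouring digits in place, so "aa3aaa" becomes "3aaa" and "a3a3a" becomes "33". If the digits should go as well, the pattern would need to be extended.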

You can replace \p{L} with [a-zA-Z] if your flavor doesn't support Unicode properties.
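As a rough illustration of that fallback, here is the same idea with the ASCII character class. The function name `removeShortAsciiGroups` and the inputs are assumptions made for this sketch. Accented letters are no longer treated as letters here, which matters given the question's requirement about Latin characters.

```php
<?php
// Hypothetical ASCII-only variant: [a-zA-Z] stands in for \p{L}.
// No /u modifier is used, so the subject is processed byte by byte.
function removeShortAsciiGroups(string $text): string
{
    $pattern = '/(?<![a-zA-Z])[a-zA-Z]{1,2}(?![a-zA-Z])/';

    // Return the input unchanged if preg_replace fails and yields null.
    return preg_replace($pattern, '', $text) ?? $text;
}

echo removeShortAsciiGroups('a3 hello 3bb'), "\n"; // prints "3 hello 3"

// "é" and "ñ" are not in [a-zA-Z], so "a" and "o" in "año" count as
// isolated one-letter groups and are removed, while "é" is kept.
echo removeShortAsciiGroups('é1 año'), "\n";       // prints "é1 ñ"
```

With the `\p{L}` version and the `/u` modifier, the second input would instead give "1 año", since "é" counts as a letter and "año" is a three-letter word.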

