I am reading words from an image (a licence), and I know that every word on this licence is uppercase. So if it picks up background noise and returns lowercase letters, alone or inside words, I know that this is useless or incorrect information.
This is the data returned from a licence:
aaaa——————————————aESESESsSs—— VICTORIA AUSTRALIA JANE CITIZEN 87652301 FLAT 10 " 77 SAMPLE-PARADE . ‘ KEW-EAST VIC 3102 .\ e ol LICENCE EXPIRY DATE OF BIRTH 20-05-2019 29-07-1983 \ ' ) EICENCE TYRE ‘CONDITIONS Alh 7 al CAR A\ SBEAXYZ 28071985 SN |_vicroads | =< AN e
One of my steps in sanitising this into useful information is to remove all words and letters that contain lowercase characters. Do I need to split by space, then iterate through each word and remove it if it contains a lowercase letter, or is there a regex pattern I can use?
I tried this:
text = text.replace(/[^A-Z0-9 \n]/g, '')
but I would also like to remove whole words that contain lowercase letters, as opposed to just the lowercase letters themselves.
Thanks
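Use a pattern built from the following parts:
\W* - zero or more non-word characters
\w* - zero or more word characters
[a-z] - one lowercase letter
\w* - zero or more word characters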
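As a rough sketch of how those four pieces might fit together, the snippet below assumes they are simply concatenated into \W*\w*[a-z]\w*. That concatenation is an assumption and may differ from the exact pattern this answer intended. The helper name stripLowercaseWords and the sample string are hypothetical.

```javascript
// Hypothetical helper: the pattern is an ASSUMED concatenation of the
// four pieces listed above, not a confirmed copy of the original answer.
function stripLowercaseWords(text) {
  // \W*          leading non-word characters (spaces, punctuation)
  // \w*[a-z]\w*  a run of word characters with at least one lowercase letter
  return text.replace(/\W*\w*[a-z]\w*/g, '').trim();
}

console.log(stripLowercaseWords('JANE CITIZEN e ol LICENCE Alh 7 al CAR'));
// → "JANE CITIZEN LICENCE 7 CAR"
```

Note that \W* also swallows any punctuation directly in front of a removed word, and \w covers digits and underscores, so a token such as 7a would be removed as a whole.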
You can optionally match non-whitespace characters other than lowercase letters, then match a lowercase character followed by optional non-whitespace characters.
The pattern matches:
[^\sa-z]* - optionally match non-whitespace characters except a-z
[a-z] - match a single lowercase character a-z
\S* - match optional non-whitespace characters
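A minimal sketch of this approach in JavaScript, assuming the three parts are joined into [^\sa-z]*[a-z]\S* and that collapsing the leftover whitespace is wanted. The helper name removeLowercaseTokens, the whitespace clean-up step, and the sample string are illustrative additions, not part of the answer itself.

```javascript
// Hypothetical helper: removes every whitespace-delimited token that
// contains at least one lowercase letter.
function removeLowercaseTokens(text) {
  return text
    .replace(/[^\sa-z]*[a-z]\S*/g, '') // drop tokens containing a-z
    .replace(/\s+/g, ' ')              // collapse the gaps left behind (assumed clean-up)
    .trim();
}

const sample = 'JANE CITIZEN e ol LICENCE EXPIRY Alh 7 al CAR |_vicroads';
console.log(removeLowercaseTokens(sample));
// → "JANE CITIZEN LICENCE EXPIRY 7 CAR"
```

Because the pattern works on whitespace-delimited tokens, a hyphenated token such as SAMPLE-PARADE is kept intact, while something like |_vicroads is removed as one unit. The character filter from the question can still be applied afterwards to strip leftover symbols.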