First scan results - people in the news - keywords - Politics, Art, Culture, Technology, Entertainment - with a UK emphasis

Results of scanning RSS news feeds

1/ - people whose name appears 3 times or more during the last 5 weeks

- not a list of all-time most popular people - this time next year many of these names will not appear, or will be way down on the list - new ones will be added

2/ - first names - a simple extracted total of the first word of a person's name

eg - all occurrences of Donald (as a first name) are totalled (421), likewise George (114) etc
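
A minimal sketch of both tallies, assuming the names have already been pulled out of the feeds into a plain list of strings - the extraction step isn't shown, and the function name, threshold parameter and sample data are made up for illustration, not taken from the actual scanner:

```python
from collections import Counter

def tally_names(names, min_count=3):
    """Return (full names seen min_count times or more, first-name totals)."""
    full = Counter(names)
    # 1/ keep only people whose whole name appears min_count times or more
    frequent = {name: n for name, n in full.items() if n >= min_count}
    # 2/ total the first word of every extracted name
    first = Counter(name.split()[0] for name in names if name.split())
    return frequent, first

# Illustrative data only
sample = ["Donald Trump"] * 3 + ["Donald Tusk"] * 2 + ["George Osborne"]
frequent, first = tally_names(sample)
print(frequent)                          # {'Donald Trump': 3}
print(first["Donald"], first["George"])  # 5 1
```

Note that the first-name total counts every Donald, including those whose full name falls below the three-mention cut-off.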

# limitations - The results are for the whole name, not derivatives

So - Theresa May comes in at 83 - references to "Mrs May", "Prime Minister May", "May's" (eg - "May's policies") etc, are not included (yet)
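
To show what "whole name only" means in practice, here is a rough sketch using a regular expression - the function name and the sample text are assumptions for illustration, and the real scanner may well match differently:

```python
import re

def count_whole_name(text, full_name):
    """Count occurrences of the exact full name, matched as whole words."""
    pattern = r"\b" + re.escape(full_name) + r"\b"
    return len(re.findall(pattern, text))

text = ("Theresa May met ministers. Mrs May said little. "
        "May's policies were criticised. Theresa May left.")
print(count_whole_name(text, "Theresa May"))  # 2
```

"Mrs May" and "May's policies" are skipped, as described above. One caveat of this particular pattern - it would count "Theresa May's", because the apostrophe sits after a word boundary.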

# diacritics - All accents (eg the é in José) have been removed and converted to their manual-typewriter equivalent (Jose) - the processes of parsing various sources of data and copying, editing and re-scanning introduce too many errors if the accents are left in - no offence intended
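
One common way to do this sort of accent-stripping, sketched with Python's standard unicodedata module - this is an assumption about how it could be done, not a description of the process actually used here:

```python
import unicodedata

def strip_accents(text):
    """Decompose accented characters, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_accents("José"))  # Jose
```

Letters that have no decomposition (the Danish ø or the Polish ł, for instance) pass through unchanged, so they would need a small hand-made replacement table on top.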

# errors - Take the fictitious name "Warren Loehnig" - now, I do not know whether this is a person or the name of a commercial company. My database is being compiled and refined over time as the results come in and start to become more meaningful. Many of the more obscure entries have been checked, but any sort of automated Google search gets blocked by their servers - no doubt shareholders will be relieved to know that bandwidth isn't being wasted on lallygag research

# parsing errors - As a rule, sentences begin with a capital letter no matter whether the word is a proper noun or not. The sentence "May you have a peaceful and prosperous New Year" does not reference Theresa May; "May will resign if Brexit negotiations fail" does - (and as for "May is the best month of the year" - well, [*~!~*])
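
A rough sketch of the first step in dealing with this - flagging which occurrences sit at the start of a sentence, where the capital letter proves nothing. The function name and the labels are made up for illustration, and it only narrows the problem down - it can't tell a Prime Minister from a month:

```python
import re

def label_mentions(text, word="May"):
    """Label each whole-word occurrence as 'sentence-initial' or 'mid-sentence'."""
    labels = []
    for match in re.finditer(r"\b" + re.escape(word) + r"\b", text):
        # Text before the match, minus trailing spaces and opening quotes
        before = text[:match.start()].rstrip(" \t\n\"'")
        # Nothing before it, or the previous sentence has just ended:
        # the capital letter is no evidence of a proper noun
        if before == "" or before[-1] in ".!?":
            labels.append("sentence-initial")
        else:
            labels.append("mid-sentence")
    return labels

text = "May you have a peaceful New Year. Critics say May will resign."
print(label_mentions(text))  # ['sentence-initial', 'mid-sentence']
```

Abbreviations ending in a full stop ("Mrs. May") would be mislabelled as sentence-initial, so the sentence-initial cases still need a second look.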

# Results

# I have a list of useable verbs and their inflections - but it needs checking

# I have another list of verbs, this time a mix of phrasal verbs, idioms and multi-word verbs - but this too needs checking

# A thorough revision of words and their number of syllables and their stress patterns (iambic, trochaic etc) is under way - but, guess what, the biggest hurdle is checking the results
# discussion - especially the word "especially"
# how many syllables? - does anyone pronounce it es-pesh-i-al-ly(5) - how about es-pesh-al-ly(4) or es-pesh-ly(3)? - maybe even spesh-ly(2)?
# ism - iz-m or zum - humm
# discussion - stress
# noun and verb versions of the same word might be stressed differently
# you'll need a permit (1-0), if conditions permit (0-1) - give me my present (1-0), present (0-1) yourself, present (1-1) arms?
# regional variations - pronunciation and stress vary by area
# situational - for effect - a different stress for a different effect - as in the aforementioned present arms - pre-sent (pause) arms (1-1, (pause), 1)
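
One way the noun/verb difference could be stored - a sketch only, keying each stress pattern on the word plus its part of speech. The table layout and names are assumptions, and the entries are just the examples from the notes above (1 = stressed, 0 = unstressed):

```python
# Hypothetical lookup table: (word, part of speech) -> stress per syllable
STRESS = {
    ("permit", "noun"): (1, 0),
    ("permit", "verb"): (0, 1),
    ("present", "noun"): (1, 0),
    ("present", "verb"): (0, 1),
}

# Names of the two-syllable patterns
FEET = {(0, 1): "iambic", (1, 0): "trochaic", (1, 1): "spondaic"}

def foot(word, pos):
    """Return the name of the stress pattern, or 'unknown' if not listed."""
    return FEET.get(STRESS.get((word, pos)), "unknown")

print(foot("permit", "noun"))   # trochaic
print(foot("permit", "verb"))   # iambic
```

Regional and for-effect variants would need more than one pattern per key - a list of patterns rather than a single tuple, say.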

# I have a complete list of every word spelled in UK-English that has a US-English version (needs checking)

# the best way to check a list is to use it
# after the initial thrill of finding a list, eg 1.5 million English words, you then find you have to wade through it to get something useful
# considerations - a useable list is of more value than a complete list
# plough through...
# variant-spellings, commonly-mistyped words, archaic words, never-ever-used words, why-on-earth-write-this-word-down words, let's-see-how-many-prefixes-and-suffixes-we-can-add words, tracker-words (made-up words inserted so you can track when your work is being plagiarised)
# syndrome
# my list is bigger than yours
# modernity - iPlayer, repurposing, Brexit
# autocorrect, Autocorrect - trade names become common names
# dog sleigh becomes dog-sleigh becomes dogsleigh (one day)(maybe)
# and...
# commonality - all the lists I've found of how commonly a word is used are way out of date and, secondly, they don't indicate what part of speech has been analysed - eg "runs" - he runs a mile, she runs a company, score ten runs, get a bad dose of the runs, the river runs from x to y, the bus runs on time, he always runs out of milk, the runs in her tights irritated her, she always runs for Lord Mayor but never gets elected, when The Times runs a feature on your latest web-update you know you're in the big time, he runs guns from Russia to Britain
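
What a frequency list split by part of speech might look like - a sketch that assumes the words have already been tagged somehow (the tagging is the hard part and isn't shown). The tagged list below is hand-made from a few of the examples above:

```python
from collections import Counter

# Hypothetical pre-tagged tokens: (word, part of speech)
tagged = [
    ("runs", "verb"),   # he runs a mile
    ("runs", "verb"),   # she runs a company
    ("runs", "noun"),   # score ten runs
    ("runs", "noun"),   # the runs in her tights
    ("runs", "verb"),   # the bus runs on time
]

by_pos = Counter(tagged)                         # counts per (word, pos)
by_word = Counter(word for word, _ in tagged)    # counts per word only

print(by_word["runs"])            # 5
print(by_pos[("runs", "verb")])   # 3
print(by_pos[("runs", "noun")])   # 2
```

A plain frequency list gives only the first number - the other two are what's missing from the lists I've found.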

# proper nouns
# downloaded word lists all include a number of proper nouns but they're full of the names of people I've never heard of and places I didn't know existed - why is that? - especially when places and people I do know of are not listed - and these missing ones are words that I reckon most English-readers will be very familiar with
# so I've started work on making my own list of proper nouns - people, places, commercial companies and products, books and films, political and religious isms...
# which is why I started scanning news feeds
# what I know and what I learn

# lallygag is a variant of lollygag, which is a verb, but I've used it as an adjective, so there

