Words
The program searches for words in one or two dictionaries using one of the search options. The English dictionary contains about 134,000 words and the Russian dictionary about 445,000 words. In addition, you can view dictionary statistics, the frequency of word lengths (descending), the distribution of two adjacent characters, and letter frequencies. You can also run a partial version of the program online, where the search is implemented via CGI and PHP.
Search anagrams
An anagram is a word whose letters can be rearranged to produce another word. For example, calligraphy - graphically. The longest anagrams are:
pathophysiological physiopathological 18 characters |
crystallographica crystalographical 17 characters |
microphotographic photomicrographic 17 characters |
autoradiographic radioautographic 16 characters |
micromillimeters micromillimetres 16 characters |
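One common way to find anagrams is to group words by their sorted letters. The sketch below illustrates that idea only; it is not taken from the program's source, and the function name find_anagrams and the sample list are assumptions made for this example.

```python
from collections import defaultdict

def find_anagrams(words):
    """Group words that consist of exactly the same letters."""
    groups = defaultdict(list)
    for word in words:
        # Anagrams of each other share the same sorted letter string.
        groups["".join(sorted(word))].append(word)
    # Keep groups with at least two words, longest words first.
    found = [group for group in groups.values() if len(group) > 1]
    return sorted(found, key=lambda group: -len(group[0]))

# Hypothetical sample; a real run would load a dictionary file.
sample = ["calligraphy", "graphically", "autoradiographic",
          "radioautographic", "rotator"]
print(find_anagrams(sample))
```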
Search pangrams
Pangrams are words that contain as many different letters of the alphabet as possible. For example, the word superacknowledgement has 16 different characters.
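Counting different characters can be sketched with a set. The helper names distinct_letters and best_pangrams are hypothetical, and the sketch assumes that words contain letters only.

```python
def distinct_letters(word):
    """Number of different characters in the word (case-insensitive)."""
    return len(set(word.lower()))

def best_pangrams(words, top=5):
    """Words with the most different characters, best first."""
    return sorted(words, key=distinct_letters, reverse=True)[:top]

print(distinct_letters("superacknowledgement"))
```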
Search using pattern
Search for words by a pattern of letters. Found words include all of the pattern letters, some of them, or consist only of letters from the pattern. Search options:
- words contain all of the pattern letters - words have all of the letters from the pattern and any number of additional letters.
- words contain part of the pattern letters - words contain some of the letters from the pattern and no other characters.
- words contain part of the pattern letters (ordered) - words contain some of the letters from the pattern and no other letters, and the order of the letters in the found word must be the same as in the pattern. For example, if you remove some of the letters from the pattern lamebrained, you can get the word ameba.
- words consist of only pattern letters - every letter of the word must be one of the pattern letters.
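The four options can be sketched as four small predicates. This is an illustration under stated assumptions, not the program's code: the function names are hypothetical, and the sketch assumes that repeated letters count separately in the first three options (a pattern with one a allows only one a), which the description does not spell out.

```python
from collections import Counter

def contains_all(word, pattern):
    # Every pattern letter occurs in the word; extra letters are allowed.
    return not Counter(pattern) - Counter(word)

def contains_part(word, pattern):
    # The word uses only letters available in the pattern (with counts).
    return not Counter(word) - Counter(pattern)

def contains_part_ordered(word, pattern):
    # The word can be obtained by deleting letters from the pattern.
    letters = iter(pattern)
    return all(ch in letters for ch in word)

def only_pattern_letters(word, pattern):
    # Each letter of the word is one of the pattern letters, any count.
    return set(word) <= set(pattern)

print(contains_part_ordered("ameba", "lamebrained"))
```

The ordered check consumes the pattern from left to right, so each pattern letter can be matched at most once.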
Search palindromes
A palindrome is a word that reads the same backwards as forwards. For example, rotator.
reviver 7 characters |
rotator 7 characters |
hallah 6 characters |
hannah 6 characters |
mallam 6 characters |
Search words for crossword
Search for words that fit a crossword template. For the template m***m the program finds the words madam, modem, etc.
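A template of this kind can be turned into a regular expression in which every * stands for exactly one unknown letter. The sketch below assumes that reading of the template; the function name crossword_search and the sample words are hypothetical.

```python
import re

def crossword_search(template, words):
    # Each * matches one arbitrary character; other characters are literal.
    regex = re.compile(
        "".join("." if ch == "*" else re.escape(ch) for ch in template)
    )
    return [word for word in words if regex.fullmatch(word)]

print(crossword_search("m***m", ["madam", "modem", "mom", "museum", "maxim"]))
```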
Search regular expressions
Search by regular expression with a given number of matches. A regular expression is a sequence of symbols and characters expressing a string or pattern to be searched for within a longer piece of text. Regular expressions are a very powerful tool for searching. For more information, search the internet for Perl regular expressions. Examples:
- With the string we$ the program finds the words that end with we.
- (.)(.*\1){6,} - finds words in which some character occurs seven or more times. Found word: stresslessness (the letter s appears 7 times).
- (..)(.*\1){2,} - finds words in which some pair of characters occurs three or more times. Found word: confrontational (the letters on appear 3 times).
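The example patterns can be tried with Python's re module, which accepts these particular expressions with the same meaning. This is only a way to experiment with them; the helper regex_search and the short word list are assumptions for the example.

```python
import re

def regex_search(pattern, words):
    """Return the words in which the pattern matches somewhere."""
    compiled = re.compile(pattern)
    return [word for word in words if compiled.search(word)]

# Hypothetical sample list instead of a full dictionary.
words = ["stresslessness", "confrontational", "awe", "owed"]

print(regex_search(r"we$", words))
print(regex_search(r"(.)(.*\1){6,}", words))
print(regex_search(r"(..)(.*\1){2,}", words))
```

Note that the last pattern also matches stresslessness, because the pair ss occurs three times in it.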
Search modifications of words.
There are three types of modifications: replaces, inserts and substrings. It is also possible to set several modifications.
REPLACES. Syntax string1>string2.
String ab>cde means replace all occurrences of ab with cde. String k>0 means replace all characters k with the empty string.
INSERTS/CONCATENATIONS. Syntax +index,string.
String +0,b adds the character b at the beginning (before the first symbol of the word). String +2,we inserts we before the third symbol of the word. String +L,y adds y at the end. String +L-1,s inserts s before the last symbol of the word.
SUBSTRINGS. Syntax -index,length.
String -0,3 keeps the first three characters of the word. String -4,3 keeps three characters starting from the fifth character. If the second parameter is not set (e.g. -3), all characters to the end are kept (starting from the fourth symbol). String -L-5,3 keeps three letters starting from the fifth character from the end of the word. String -L-5 keeps the last five characters. Other examples: -4,L-5 and -L-4,L-5.
Not every replace changes a word. If the option "every modification changes word" is on, then every single operation must change the word.
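The modification syntax can be illustrated with a small interpreter. This is a sketch of the rules as described above, not the program's parser: the names resolve_index and apply_modification are hypothetical, L is read as the word length, and 0 on the right side of a replace is read as the empty string.

```python
def resolve_index(expr, length):
    """Turn '3', 'L' or 'L-2' into a number; L is the word length."""
    if expr.startswith("L"):
        rest = expr[1:]
        return length + (int(rest) if rest else 0)
    return int(expr)

def apply_modification(word, rule):
    if ">" in rule:  # replace: string1>string2
        old, new = rule.split(">", 1)
        return word.replace(old, "" if new == "0" else new)
    if rule.startswith("+"):  # insert: +index,string
        index, text = rule[1:].split(",", 1)
        i = resolve_index(index, len(word))
        return word[:i] + text + word[i:]
    if rule.startswith("-"):  # substring: -index[,length]
        start, _, count = rule[1:].partition(",")
        i = resolve_index(start, len(word))
        if not count:
            return word[i:]
        return word[i:i + resolve_index(count, len(word))]
    raise ValueError(f"unknown modification: {rule}")

print(apply_modification("tor", "+2,we"))
print(apply_modification("understand", "-L-5"))
```

Several modifications could then be applied one after another by feeding the result of one call into the next.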
Search chains of words
Search for chains of words in which only one character changes at each step. Consider the two words woman and chick. There is a chain woman women woven coven coves cores corns coins chins chink chick. The program finds all chains of the shortest length.
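Shortest chains of this kind are commonly found with a breadth-first search. The sketch below shows that general technique and is not claimed to be the program's algorithm; the function names and the tiny word list (just the words of the chain above) are assumptions for the example.

```python
from collections import deque

def differ_by_one(a, b):
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def shortest_chains(start, goal, words):
    """Return every shortest chain from start to goal."""
    words = set(words) | {start, goal}
    distance = {start: 0}
    parents = {start: []}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            break  # all shortest ways into goal are already recorded
        for word in words:
            if not differ_by_one(current, word):
                continue
            if word not in distance:
                distance[word] = distance[current] + 1
                parents[word] = [current]
                queue.append(word)
            elif distance[word] == distance[current] + 1:
                parents[word].append(current)
    if goal not in distance:
        return []

    def build(word):
        if word == start:
            return [[start]]
        return [chain + [word] for p in parents[word] for chain in build(p)]

    return build(goal)

chain_words = ("woman women woven coven coves cores "
               "corns coins chins chink chick").split()
print(shortest_chains("woman", "chick", chain_words))
```

Comparing the current word with every dictionary word is slow for large dictionaries; it is kept here for brevity.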
Search characters sequence
Search for a sequence of characters inside words. For the sequence mata the program finds the words matador, razzmatazz and so on.
Searching for letter group splits
Splitting a group of letters into two or three words. First all splits of a group of letters into two words are searched, then all splits into three words are searched. For example, the letter group anticodonpall can be split into the two words capital and london or into the three words all, pant, conoid.
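A split can be checked by comparing letter counts. The following brute-force sketch is one possible illustration, not the program's method; the function name splits and the sample word list are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement

def splits(group, words, parts):
    """Find combinations of `parts` words that use exactly the letters of group."""
    target = Counter(group)
    # Only words whose letters all fit into the group can take part.
    usable = [w for w in words if not Counter(w) - target]
    return [
        combo
        for combo in combinations_with_replacement(usable, parts)
        if sum((Counter(w) for w in combo), Counter()) == target
    ]

sample = ["capital", "london", "all", "pant", "conoid", "cat"]
print(splits("anticodonpall", sample, 2))
print(splits("anticodonpall", sample, 3))
```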
Search simple sequence words
Search for two sets of words such that some number of the last characters of every word in the first set are the first characters of every word in the second set. Consider two sets. The first consists of the two words animated and exanimated. The second also has two words, animated and animatedly. The last eight letters of animated and exanimated agree with the first eight letters of animated and animatedly.
Note. The word animated has eight characters, so its first eight characters are the same as its last eight characters. In this sense any word is a sequence of itself. If no other words are found that start or end with animated, the set is trivial and is not considered.
Search double sequence words
Search for two sets of words such that some number of the last characters of every word in the first set are the first characters of every word in the second set, and vice versa. Consider two sets. The first consists of the two words bleeder and blender. The second also has two words, derisible and derivable. The last three letters of bleeder and blender agree with the first three letters of derisible and derivable. At the same time, the last three characters of derisible and derivable are the first three characters of bleeder and blender.
Note. As in the case of searching for sequence words trivial sets are excluded.
Search full word sequences
Full sequence words are pairs of words in which reading the first word backwards gives the second word, and vice versa. For example, kramer - remark. Palindromes are ignored.
desserts stressed |
lattimer remittal |
deliver reviled |
dessert tressed |
animal lamina |
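Such pairs can be found by looking up each reversed word in a set. A minimal sketch, with the hypothetical function name reversed_pairs and a short sample list:

```python
def reversed_pairs(words):
    known = set(words)
    pairs = []
    for word in sorted(known):
        mirror = word[::-1]
        # word < mirror reports each pair once and skips palindromes.
        if word < mirror and mirror in known:
            pairs.append((word, mirror))
    return pairs

sample = ["desserts", "stressed", "kramer", "remark",
          "rotator", "animal", "lamina"]
print(reversed_pairs(sample))
```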
Search keyboard words (one row)
All characters of the word must be on one horizontal or vertical keyboard row.
peppertree 10 characters |
pepperwort 10 characters |
perpetuity 10 characters |
proprietor 10 characters |
repertoire 10 characters |
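The horizontal case can be sketched as a subset test against each letter row. The sketch assumes a standard QWERTY layout and covers horizontal rows only; how the program defines a vertical row is not described here, so that case is left out. The names ROWS and on_one_row are hypothetical.

```python
# Letter rows of a QWERTY keyboard (assumed layout).
ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm")

def on_one_row(word):
    letters = set(word.lower())
    return any(letters <= set(row) for row in ROWS)

print(on_one_row("proprietor"))
```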
Search keyboard words (row and diagonals)
Every two neighbouring characters of the word must be on adjacent keys (horizontally, vertically or diagonally) on the keyboard.
redressed 9 characters |
redresser 9 characters |
redresses 9 characters |
assessed 8 characters |
assesses 8 characters |
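Key adjacency can be modelled by giving every key a row and a column. The sketch below treats the QWERTY letter rows as a plain grid and counts a repeated letter (the same key) as adjacent; both are assumptions made for illustration, and the program's actual key geometry may differ. The names POSITION, keys_adjacent and is_adjacent_word are hypothetical.

```python
# Letter rows of a QWERTY keyboard, treated as a simple grid (assumption).
ROWS = ("qwertyuiop", "asdfghjkl", "zxcvbnm")
POSITION = {ch: (r, c) for r, row in enumerate(ROWS) for c, ch in enumerate(row)}

def keys_adjacent(a, b):
    (r1, c1), (r2, c2) = POSITION[a], POSITION[b]
    # Same key, or a neighbour horizontally, vertically or diagonally.
    return abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1

def is_adjacent_word(word):
    return all(keys_adjacent(a, b) for a, b in zip(word, word[1:]))

print(is_adjacent_word("redressed"))
```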
Search consonants or vowels sequences
Search for words with a given number of consecutive consonants or vowels. The words that have six consonants in a row are knightsbridge, festschrift, goldschmidt, latchstring, sightscreen, weltschmerz, watchstrap. The words that have five vowels in a row are liaoyang, queueing, iyeyasu, taiyuan.
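The length of the longest run can be computed with a regular expression. The sketch assumes that y counts as a vowel, which is what the listed examples imply (liaoyang, iyeyasu, taiyuan), and that words contain letters only. The name longest_run is hypothetical.

```python
import re

VOWELS = "aeiouy"  # y treated as a vowel, as the examples suggest

def longest_run(word, vowels=True):
    """Length of the longest run of vowels (or consonants) in the word."""
    pattern = f"[{VOWELS}]+" if vowels else f"[^{VOWELS}]+"
    runs = re.findall(pattern, word.lower())
    return max(map(len, runs), default=0)

print(longest_run("knightsbridge", vowels=False))
print(longest_run("queueing"))
```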
Search words with low density of consonants or vowels
Search for words with the lowest percentage of vowels or of consonants.
Vowel characters.
strengths | vowels 11% | 9 characters |
mcknight | vowels 12% | 8 characters |
schmaltz | vowels 12% | 8 characters |
schnapps | vowels 12% | 8 characters |
schwartz | vowels 12% | 8 characters |
Consonant characters.
iyeyasu | consonants 14% | 7 characters |
euboea | consonants 16% | 6 characters |
eyetie | consonants 16% | 6 characters |
ieyasu | consonants 16% | 6 characters |
ukiyoe | consonants 16% | 6 characters |
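The percentages in these tables are consistent with counting y as a vowel and rounding down to a whole percent (for example, 1 of 6 letters gives 16%). The sketch below follows that reading; it is inferred from the listed values, and the function names are hypothetical.

```python
VOWELS = set("aeiouy")  # y treated as a vowel, as the tables suggest

def vowel_percent(word):
    count = sum(ch in VOWELS for ch in word.lower())
    return 100 * count // len(word)  # rounded down

def consonant_percent(word):
    count = sum(ch not in VOWELS for ch in word.lower())
    return 100 * count // len(word)  # rounded down

print(vowel_percent("strengths"), consonant_percent("iyeyasu"))
```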
Searching words from two dictionaries
Matched words (simple)
Search for words that are written in English the same way as in Russian. For example, Beep (English word) - веер (Russian word). Among them are words that are not only written the same, but also have the same meaning in both languages: tokamak, Tokmak, Mekka, totem, atom, mama, kama (a Hinduism concept and the name of a river). Interesting pairs: Bumble витые, HoBble новые.
сотрете compete | 7 characters |
сотрите compute | 7 characters |
токамак tokamak | 7 characters |
гаснет rachet | 6 characters |
довьет gobbet | 6 characters |
кагате karate | 6 characters |
Matched words (transliteration)
Search for words that are written in English the same way as transliterated Russian words. For example, administrator (English word) - администратор (Russian word).
трансценденталист transcendentalist | 17 characters |
днепродзержинск dneprodzerzhinsk | 15 characters |
инструменталист instrumentalist | 15 characters |
мультипроцессор multiprocessor | 15 characters |
антидепрессант antidepressant | 14 characters |
антимилитарист antimilitarist | 14 characters |
днепропетровск dnepropetrovsk | 14 characters |
министериалист ministerialist | 14 characters |
Search keyboard words
Search for pairs of words where a word typed with the wrong keyboard layout gives a valid word in the other language. For example, entity (English word) - утешен (Russian word).
entity утешен | 6 characters |
erect укусе | 5 characters |
ghent пруте | 5 characters |
inert штуке | 5 characters |
abut фиге | 4 characters |
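The effect of typing with the wrong layout can be sketched as a character-by-character translation between the standard QWERTY and ЙЦУКЕН layouts. The names LATIN, CYRILLIC, retype and layout_pairs are hypothetical, and the sample word lists stand in for the two dictionaries.

```python
# Keys of the QWERTY layout and the characters the same keys produce
# in the standard Russian (ЙЦУКЕН) layout.
LATIN = "qwertyuiop[]asdfghjkl;'zxcvbnm,."
CYRILLIC = "йцукенгшщзхъфывапролджэячсмитьбю"
TO_RUSSIAN = str.maketrans(LATIN, CYRILLIC)

def retype(word):
    """What an English word becomes when typed in the Russian layout."""
    return word.lower().translate(TO_RUSSIAN)

def layout_pairs(english_words, russian_words):
    russian = set(russian_words)
    return [(w, retype(w)) for w in english_words if retype(w) in russian]

print(layout_pairs(["entity", "erect", "word"], ["утешен", "укусе"]))
```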
Additions
Dictionary statistics
Dictionary statistics include:
- character frequency at the beginning of the word
- character frequency at the end of the word
- character frequency at any place in the word
- character frequency on keyboard
- total number of words
- average word length
- the longest word
Frequency of word length (descend)
The program shows the frequency of words of each length.
Word length appears to be a normally distributed random variable. The pictures below show the word length frequency data and the normal distribution approximation.
English dictionary. Expected value - 8.9546. Variance - 8.2675. |
Russian dictionary. Expected value - 9.6727. Variance - 9.2424. |
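The approximation can be reproduced from the two numbers quoted for each dictionary. The sketch below reads the second number as the variance of the distribution, which is an assumption, and evaluates the normal density at whole word lengths as an estimate of the share of words of that length. The names normal_pdf and ENGLISH are hypothetical.

```python
import math

def normal_pdf(x, mean, variance):
    """Density of the normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)

# Values quoted above for the English dictionary.
ENGLISH = (8.9546, 8.2675)

for length in range(1, 21):
    share = normal_pdf(length, *ENGLISH)
    print(f"{length:2d} {share:.4f}")
```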
Two nearby characters distribution
The program shows the distribution of pairs of adjacent characters in the words of the dictionary, sorted in descending order.
Letters frequency
If the letters are sorted by frequency in descending order and plotted, the logarithm of the frequency appears to be close to a linear function.
Versions history
version 4.43 28 march 2024
added search for words containing part of letters from the template (ordered)
added distribution of the sequence of two letters at the beginning/end of a word
filter works for all search/statistics options
sorting works for the chains of words option
project recompiled under new version of gcc and gtk libraries (gtk library bug with text alignment is fixed)
removed/added several words from russian dictionary, added word tokmak in english dictionary
fixed bug with end time
added using of custom cgi library in cgi mode
added deleting/adding words to dictionaries
image height in about dialog is counted from content
version 4.42 17 february 2022
fixed bugs if NDEBUG not defined, now compiles normally
remove incorrect word from russian dictionary
version 4.41 31 december 2021
version 4.4 3 november 2021
add positive/negative regex filter of results
sources added to github
project uses aslov helper library
add language dependent thousand separator symbol
version 4.3 3 april 2018
add word chain search (c++ only)
add check new version
fixed bug(?) for threads synchronization
fixed bug when user select search matched words using two dictionaries (simple and translit) when selected english dictionary (c++ only)
fixed bug when user select search using two dictionaries and sorting by percent of vowels/consonants when selected russian dictionary (c++ only)
fixed bug for online version (search image not drawing correctly under chrome browser). Image encoded using photoshop
code is adapted for new gtk version 3.22.28
added russian words from spellchecker webpage libreoffice.org 254,740 new words
added russian words from Lopatin (892 new words) and Ozhegov (259 new words) dictionaries
russian words from Dal dictionary were not added because there are too many old words
version 4.2 3 december 2017
add online version of program using cgi (cgi_words and gtk_words projects use common files)
add online version of program using php scripts
code is adapted for unix systems
search concatenations/substrings is changed to more general modifications search
change search of simple/double sequence words. The same words now are omitted if only one word in both sets is found
search strings now case insensitive for all options
search in found words also case insensitive
alphabet sorting added for two nearby characters distribution option if frequencies are the same
accelerate search keyboard words in two dictionaries
improve thread synchronization
version 4.1 22 december 2016
fix bug with regular expressions and some other searches
source code is adapted for new gcc 5.4.0 & gtk 3.20.6 versions
accelerate regular expressions search
add search of concatenations/substrings
fix bug when user press accelerator key and text entry is active
correct two words in russian dictionary
version 4.0 1 october 2015
sort and find text options are added
search of pangrams is added
search of double sequence words is added
version 3.9 1 september 2014
version 3.8 ** march 2014
add dictionary unit function
version 3.7 ** march 2013
status bar added and time of last operation
regular expressions search added
two nearby characters distribution added
adding support of multilanguage interface
search of variants of words removed