The word list being checked against isn't quite the same as whatever Wordle is using for possible guesses (this is mentioned on the page); for example, "DUCKY" is a valid Wordle guess, but shown as invalid when entered into this tool.
The original Wordle word list was contained in plaintext in the game's JavaScript (which is what is used for this map); however, I believe the current version on NYT uses a different list. I don't know whether the current set of guessable words, or the list of words that can be chosen as solutions, is known. I think some people have mentioned the list may have been expanded.
2. Words that are ever used as solutions in Wordle
I think this is a great design choice (and IIRC was part of the original game). That is, Wordle doesn't want to use really obscure or uncommon slang words as the solution, as that would make the game less enjoyable (I wouldn't like it so much if the solution was a word I'd never heard of). But, when I get stuck and can't think of a word that will fit as my next guess, sometimes I just go into "mash keyboard mode" so hopefully I'll find something that will fit and get more information about the solution. More often than not something fits, and I'm usually surprised my guess is an actual word.
So, to your point, DUCKY is a valid guess but will never be an actual solution in Wordle.
I think the design choice is key to the game, but for a different reason. “Be liberal in what you accept” -> don’t reject a valid but obscure word as a guess, because people who know it’s a word will be rightly annoyed that the game is kneecapping their play.
NYT's Spelling Bee game is an example of the opposite case. It's a little bit like Wordle (you're given a set of letters and need to make words), but it doesn't accept obscure words. So I'm constantly coming up with words that fit the letters nicely and get rejected. It drains away a lot of the enjoyment.
I think the NYT's argument is that allowing obscure words drains excitement from those who don't know the word. They can't get a perfect score. So they have a choice of annoying you or annoying them and they chose to play to a broad base.
Squaredle [1] accepts many obscure words as bonus words that don’t count toward the canonical maximum word score but do serve as a tiebreaker for the leaderboard.
While I had kind of intuited this (I’ve developed my strategy over time to always favor the more common word when choosing among multiple possibilities), I didn’t realize some of the places where Wordle cuts its solution set. Specifically, I somehow never noticed that solutions are never s-suffixed plural, but you can clearly see it in the visualization if you try entering a word with four letters.
This is an important feature in fact, as otherwise the space of possible solutions grows to include nearly all 4-letter nouns (and verbs) +'s', and this is too large. I believe that regular pasts in -ed are also excluded for the same reason.
The iPhone app, on the other hand, includes -s plurals in its solution set, which makes for a markedly inferior game (it also has an error in its use of the yellow indication, which hurts the game further).
> otherwise the space of possible solutions grows to include nearly all 4-letter nouns (and verbs) +'s', and this is too large. I believe that regular pasts in -ed are also excluded for the same reason.
Regular pasts in -ed would pretty much restrict you to four-letter verbs that are spelled with a silent E (kite ~ kited); that will be a much smaller class of words than the ones that get excluded by the "s" suffixes.
There is a weird tangential observation we can make here, which is that the plural suffix -s, the verbal suffix -s, and the verbal suffix -t all behave exactly the same way - they are ordinarily zero syllables matching the voicing of the consonant that precedes them, but when that preceding consonant is similar to the consonant of the suffix, they grow an epenthetic vowel (becoming one syllable instead of zero) and become voiced - but the -s suffixes are only spelled with their epenthetic vowel when it exists (compare mats versus masses), while the -t suffix is spelled with its vowel whether the vowel exists or not (compare mopped versus modded).
Wordle seems to favor base words generally, but it doesn't seem to be as rigid about it as it is with plurals ending in an "s" (I don't think I've ever seen one of those).
ADDED: Looking at the visualization tool, there are a few past tenses. A couple of them seem to be of a form like PLIED, where the base form is actually a 3-letter word ending in Y.
First I'd heard of it. Although looking back, I've tested lots of plurals and can't recall one ever being right. Sort of degrades the game to have an unwritten rule like that.
There are actually 3 lists if you count the Wordlebot 2.0 [1] list:
1. The words that are solutions in the current game
2. An approximately 2x size list that is the space Wordlebot uses to "play" in. Supposedly it's still common words, though some seem pretty weird even to a native English speaker (and some British slang leaks through as well). For example, titer was one of Wordlebot's recommended guesses today; I correctly assumed it had something to do with titration, but I would probably never have guessed it.
3. A larger pool of words that Wordle will let you enter as guesses
what you say is somewhat correct, but this site says that these are the valid answers, not including the valid guesses, so DUCKY wouldn't be expected here.
personally, I'd prefer a tool that included all valid guesses and wouldn't lead me so precisely to answers. there's small c cheating and big C Cheating :)
At the risk of sounding like your fourth grade teacher, you’re only cheating yourself.
I would argue that using a dictionary is a tactic, and one that helps you play the game better by introducing you to the idea of a decision tree. You can ask Wordlebot how you did and see how it approaches the problem with all the knowledge short of knowing the actual word.
I clicked on the 'z's and noticed the same for "WHIZZ"
I guess the article does say:
> Note that these words only represent the list Wordle uses for possible solutions; there are many more guessable 5-letter words.
I still play Wordle daily with my folks. Gets the juices flowing and gets daily conversations started.
Anyway, HORSE was used like 3 days apart. I was shocked, and annoyed.
See, if I were to write Wordle, which arguably I did too (see http://curdle.me) but I digress, I'd make an array of all words, randomize it, store that, match one per day and call it done. Years worth of fun.
But that doesn't appear to be what NYT is doing, they pick a random word every day? Which means that my evil plan of starting every single day with the same word might never get a N1 or I might get it more than once in a year!
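A minimal sketch of the "randomize once, then use one per day" idea described above. The word list, seed, and start date here are placeholders chosen for illustration, not anything Wordle or curdle.me is known to use. Because the shuffled order is fixed up front, no word repeats until the whole list has been used.

```python
import random
from datetime import date, timedelta

def build_schedule(words, seed=0):
    """Shuffle the word list once, deterministically, and keep that order."""
    schedule = sorted(words)               # sort first so input order doesn't matter
    random.Random(seed).shuffle(schedule)  # seeded RNG: same order on every run
    return schedule

def word_for_day(schedule, start, today):
    """Pick the word for a given date by counting days since the start date."""
    offset = (today - start).days
    return schedule[offset % len(schedule)]  # wraps only after every word is used

# Hypothetical inputs, for illustration only.
WORDS = ["horse", "worse", "slate", "crane", "feast"]
SCHEDULE = build_schedule(WORDS, seed=2024)
START = date(2024, 1, 1)

for d in range(len(WORDS)):
    day = START + timedelta(days=d)
    print(day.isoformat(), word_for_day(SCHEDULE, START, day))
```

With this scheme a fixed starting word would be the solution exactly once per pass through the list, whereas an independent random draw each day gives no such guarantee.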
You are correct, I think what has happened is I confused WORSE and HORSE and must have used HORSE when it was incorrect but remembered it as the solution. Thanks for the correction!
If the author is reading: zero probability and near zero have indistinguishable colors.
That should be considered an error. Any letter that isn't impossible should be clearly labelled as possible. Right now you get lines pointing to seemingly black letters, which looks like a mistake in the visualization.
Is the site inspired by FiveThirtyEight or Nate Silver in general? Very nifty, highly performant, and takes me back to when FiveThirtyEight was just a small stats blog.
Thank you! My friend and I started this project pretty early on during COVID. FiveThirtyEight was definitely a design influence, along with pudding.cool, some NYT pieces, and others. There are some things FiveThirtyEight does with sports on the data side that I'm not a fan of, like their emphasis on "all in one" metrics to measure performance, but overall they definitely do a great job with their presentation.
I know everyone has their "Starting Word" by this point, but this graphic seems very good at discovering good starting words.
"SLATE" seems to have a huge number of solutions on each letter: 300+ words available on S----, 400+ words available on ----E. The smallest is -L---, which has 200 words, still a good size all else considered.
I'm not too sure on how to evaluate the "best starting word", but maybe I'll try "SLATE" next time I play Wordle.
EDIT: Some people are interested in "two starting words". Seems like "ROBIN" hits a lot of words that SLATE misses. -O--- in the 2nd letter was the most common letter+position after the letters S L A T and E were banned. "CHILD" also seems like another good followup word after SLATE.
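The counting described above (how many solutions have a given letter in a given position, and what remains once S, L, A, T and E are ruled out) can be sketched roughly as follows. The seven-word list is a stand-in for the real solution list, and the function names are made up for this example; summing per-position counts is just one possible way to compare starting words, not an established "best" metric.

```python
from collections import Counter

def position_counts(words):
    """For each of the 5 positions, count how many words have each letter there."""
    counts = [Counter() for _ in range(5)]
    for word in words:
        for i, letter in enumerate(word.upper()):
            counts[i][letter] += 1
    return counts

def score(guess, counts):
    """Per-position number of words sharing the guess's letter at that position."""
    return [counts[i][letter] for i, letter in enumerate(guess.upper())]

def without_letters(words, banned):
    """Keep only the words that contain none of the banned letters."""
    banned = set(banned.upper())
    return [w for w in words if not (set(w.upper()) & banned)]

# Stand-in list; the real solution list has a couple of thousand entries.
WORDS = ["SLATE", "SHARE", "CRANE", "PLATE", "ROBIN", "CHILD", "POUND"]

counts = position_counts(WORDS)
print(score("SLATE", counts))                # [2, 2, 4, 2, 4]

remaining = without_letters(WORDS, "SLATE")  # words a SLATE opener says nothing about
print(remaining)                             # ['ROBIN', 'POUND']
print(position_counts(remaining)[1].most_common(1))  # [('O', 2)]
```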
Some MIT researchers actually had found that SALET performed 1% better as a starting word. [1] However NYT recently updated the permissible word list, and TARSE is currently the statistically best starting word.
From this I concluded the best starting word is FEAST. Each column has a high likelihood of being correct. And damn if I didn’t get today’s Wordle on the second guess with this strategy. Anecdotal data FTW!
I've also found that mixing things up just because your usual word did really well the day before doesn't seem to be a good strategy: even if the solution is very different, eliminating some of the most common letters is still pretty effective.
Is there some tendency for a starting word that does well one day to do poorly the next day? I know Original Wordle was randomized so there wouldn't be, but now that there's editing there might be.
I don't know. I'll sometimes mix things up a bit if, say, I get 4 letters out of SLATE on a day. (I sometimes will use CRANE which is still very high on the starting words list.) At the same time though, even if SLATE were to come up empty the next day, that's a lot of negative information.
Since we're talking about the Wordle dictionary here, I find it relevant to plug this: the best words to solve Wordle https://www.fev.al/posts/wordle/
Cool! I love this kind of intuitive exploratory visualization.
I wonder how it could be done for my 5x5 letter Wordle variant https://squareword.org. Technically you would need 25 small alphabets, one for each letter. But that seems tricky to fit on a screen in such a visually pleasing way. I'll have to look into whether there's a better way :)
Some advice: you may want to improve the instructions of the game. There's no clear indication of what to do when the first 5-letter word has been typed - it merely bumps the word if you type more letters, or shows a cryptic message "Type your guess on the keyboard to begin!" if the square is clicked.
As for visualizations, you may want to take a look at mathematical relations [1]. Each cell is a variable with the alphabet as its domain (which is pruned as you guess correct values), and the words are 5-ary relations across rows and columns. You could display the local relations surrounding the cell under the mouse pointer.
It looks beautiful, but am I the only one who is irked by the fact that the lines cannot be joined at will?
For example, if I type PO_N_ there are two allowed words (POINT and POUND) but this particular visualization does not distinguish them from POIND and POUNT.
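To make the PO_N_ point concrete, here is a small sketch that compares the words actually matching a pattern with every combination a per-position display would seem to allow. The four-word list and the function names are assumptions for illustration; "_" stands for an unknown letter.

```python
from itertools import product

def matches(word, pattern):
    """True if word fits the pattern, where '_' matches any letter."""
    return len(word) == len(pattern) and all(
        p == "_" or p == c for c, p in zip(word, pattern)
    )

def candidates(words, pattern):
    """Words from the list that actually fit the pattern."""
    return [w for w in words if matches(w, pattern)]

def implied_by_positions(words, pattern):
    """Every string you could build by picking one 'possible' letter per position."""
    hits = candidates(words, pattern)
    per_position = [sorted({w[i] for w in hits}) for i in range(len(pattern))]
    return ["".join(letters) for letters in product(*per_position)]

# Stand-in word list for the example.
WORDS = ["POINT", "POUND", "PAINT", "SLATE"]

print(candidates(WORDS, "PO_N_"))            # ['POINT', 'POUND']
print(implied_by_positions(WORDS, "PO_N_"))  # ['POIND', 'POINT', 'POUND', 'POUNT']
```

The second list is a superset of the first: showing letters independently per position loses the information about which letters go together.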
Do you play Wordle? I actually thought this was interesting, since it’s something I’ve often wondered while playing it. “The first letter is U and the fourth is E, how big is the set of possible answers now?”
Maker of the visualization here! Curious how you would use a plaintext list of 2309 words to get at some of the questions I mentioned on the tool, like "how many words use Y as the only vowel?".
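For what it's worth, a question like "how many words use Y as the only vowel?" can be answered from a plaintext list with a short filter. This is only a sketch: the five-word sample stands in for the 2309-word list, and "vowel" is taken to mean A, E, I, O, U, with Y counted separately.

```python
VOWELS = set("AEIOU")

def y_is_only_vowel(word):
    """True if the word contains Y and none of A, E, I, O, U."""
    letters = set(word.upper())
    return "Y" in letters and not (letters & VOWELS)

def only_y_vowel_words(words):
    """Filter a word list down to the words whose only vowel is Y."""
    return [w for w in words if y_is_only_vowel(w)]

# Stand-in sample; in practice this would be read from the plaintext list.
WORDS = ["CRYPT", "NYMPH", "SLATE", "DUCKY", "LYNCH"]

result = only_y_vowel_words(WORDS)
print(len(result), result)  # 3 ['CRYPT', 'NYMPH', 'LYNCH']
```

Each new question needs its own small filter like this, which is where an interactive view has an edge for open-ended exploring.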
Tangent to the Wordle tool posted. Is PerThirtySix a reference to FiveThirtyEight? The latter must be one of the most well-known brands in the sports analytics space. Or is 36 an important number in sports?
Yes. NBA games are 48 minutes in regulation. No one plays the full 48 minutes, but a lot of players (starters mainly) do play 36 minutes. So, 36 minutes has become a common time period of production time used for analytics in the NBA.
There are a ton of variants out there, including ones with more letters and more words, and I'm sure there's JavaScript you can download, whether or not you can still get it from the Times.
I have to say I played around with a number of these at one point but I really ended up coming back to just playing Wordle once a day. The multi-word ones in particular really force a different playing style.
Very nice. I think the NYT has been tweaking the word list a bit. I know they've removed a few words, but don't know if they've added any. I don't know how much the list has drifted over time.