Title: Artificial Intelligence CIS 342
1 Artificial Intelligence CIS 342
- The College of Saint Rose
- David Goldschmidt, Ph.D.
April 22, 2008
2 Crossword Puzzle Construction
- Given
- Dictionary of valid words and phrases
- Empty crossword grid
- Problem
- Fill the crossword grid such that all words both across and down are valid
- Assign clues
3 Crossword Puzzle Construction
- Depth-First Search (DFS)
- Fill in words until a solution is found or a dead-end is encountered
- Backtrack from dead-ends
- Questions
- Where do we start?
- What word do we fill in next?
- What backtracking strategies do we use?
- How do we avoid repetition (boring puzzles)?
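The DFS idea above can be sketched in a few lines. This is a minimal illustration, not the method used in the course: the representation of a slot as a list of (row, col) cells and the names `fill_grid`, `slots`, and `words` are assumptions made for the example. The `used` set is one simple way to keep a word from appearing twice in the same puzzle.

```python
def fill_grid(slots, words, letters=None, used=None):
    """Depth-first fill of a crossword grid.

    slots: list of slots, each a list of (row, col) cells (hypothetical layout).
    words: iterable of dictionary words.
    Returns a dict mapping cell -> letter, or None if no fill exists.
    """
    letters = {} if letters is None else letters
    used = set() if used is None else used
    if not slots:
        return letters  # every slot is filled: solution found
    slot, rest = slots[0], slots[1:]
    for word in words:
        if word in used or len(word) != len(slot):
            continue
        # The word must agree with letters already placed by crossing words.
        if any(letters.get(cell, ch) != ch for cell, ch in zip(slot, word)):
            continue
        new_cells = [cell for cell in slot if cell not in letters]
        letters.update(zip(slot, word))
        used.add(word)
        result = fill_grid(rest, words, letters, used)
        if result is not None:
            return result
        # Dead-end further down: backtrack by undoing this placement.
        for cell in new_cells:
            del letters[cell]
        used.discard(word)
    return None


# A 2x2 grid: two across slots followed by two down slots.
slots = [[(0, 0), (0, 1)], [(1, 0), (1, 1)],
         [(0, 0), (1, 0)], [(0, 1), (1, 1)]]
solution = fill_grid(slots, ["TO", "AT", "NO", "AN"])
```

Here the slots are simply filled in list order; the order in which slots and words are tried is exactly what the questions above ask about.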
4 Crossword Puzzle Construction
- Optimize the DFS
- Add longer (most constrained) words first
- Associate weights with words in dictionary based on frequency of letters
- Friendly crossword puzzle words include letters S, R, E, T, D, A, I, L
- Unfriendly crossword puzzle words include letters J, Q, X, Z, F, V, W
- e.g. quiz, fix, jazz, quaff, xylophone, wax
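One possible way to turn the two letter lists into word weights is sketched below. The scoring rule (+1 per friendly letter, -1 per unfriendly letter) and the function names are assumptions made for illustration; the slide only says that weights are based on letter frequency.

```python
FRIENDLY = set("SRETDAIL")
UNFRIENDLY = set("JQXZFVW")


def friendliness(word):
    """Hypothetical score: +1 per friendly letter, -1 per unfriendly letter."""
    return sum((ch in FRIENDLY) - (ch in UNFRIENDLY) for ch in word.upper())


def rank_words(words):
    """Order dictionary words so the friendliest candidates are tried first."""
    return sorted(words, key=friendliness, reverse=True)


def order_slots(slots):
    """Fill the longest (most constrained) slots first."""
    return sorted(slots, key=len, reverse=True)
```

Under this scoring, a word such as "stare" ranks ahead of "quiz", which ranks ahead of "jazz".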
5 Crossword Puzzle Construction
- Genetic Algorithm (GA)
- Evolve a solution by crossovers and mutations through many generations
- Initial population of crossword grids
- Random letters?
- Random letters based on Scrabble frequencies?
- Random words from dictionary?
- Fitness of each grid is number of valid words
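The fitness measure and the two GA operators can be sketched as follows. The grid encoding (a list of equal-length strings with '#' for black squares) and the particular mutation and crossover operators are assumptions chosen for brevity; the crossover also assumes both parents share the same black-square layout.

```python
import random
import string


def entries(grid):
    """All across and down entries (runs of 2+ letters between '#' squares)."""
    rows = list(grid)
    cols = ["".join(col) for col in zip(*grid)]
    return [run for line in rows + cols
            for run in line.split("#") if len(run) >= 2]


def fitness(grid, dictionary):
    """Number of entries in the grid that are valid dictionary words."""
    return sum(1 for entry in entries(grid) if entry in dictionary)


def mutate(grid, rng):
    """Replace one randomly chosen white square with a random letter."""
    whites = [(r, c) for r, row in enumerate(grid)
              for c, ch in enumerate(row) if ch != "#"]
    r, c = rng.choice(whites)
    row = grid[r][:c] + rng.choice(string.ascii_uppercase) + grid[r][c + 1:]
    return grid[:r] + [row] + grid[r + 1:]


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: top rows from one parent, the rest from the other."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


grid = ["CAT", "A#O", "BEE"]
score = fitness(grid, {"CAT", "BEE", "CAB", "TOE"})
```

A grid is a full solution when its fitness equals the total number of entries, `len(entries(grid))`.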
6 Solving Crossword Puzzles
- Given
- Crossword grid
- Clues
- Problem
- Fill the grid such that all words correctly answer the given clues
7 Solving Crossword Puzzles
- Obtain candidate answers for each clue
- Assign a confidence value to each candidate
- Are we guaranteed to have the correct answer?
- Place candidate answers in grid until a solution is found or a dead-end occurs
- Which backtracking strategies should we use?
8 Solving Crossword Puzzles
- PROVERB (Duke University, 1999)
- Modules provide candidate answers from dictionaries, encyclopedias, movie databases, etc.
- One module sources a Crossword Puzzle Database of exactly 5142 previously solved puzzles
- Pivotal in PROVERB's success
- Another module generates all combinations of letters (ouch!)
9 Solving Crossword Puzzles
- Google CruciVerbalist (GCV)
10 Solving Crossword Puzzles
- GCV solved 13x13 puzzle with 68 clues
- Many clues are fill-in-the-blank or pop-culture clues
- Candidate answers obtained from Google results page (top 50)
- Solved using 559 Google queries
- Queries yielded 68 correct answers
- 44 correct answers had highest confidence
12 Clue Preprocessing
- Categorize clues based on text and type of clues
- Fill-in-the-blank clues
- Synonyms/Antonyms
- "Type of" (or "Kind of") clues
- Abbreviations
- Clues with "and" or "or"
- Singular or plural
- Number of words in answer
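A rough sketch of how clues might be sorted into several of these categories using simple pattern matching. The category labels, the regular expressions, and the function name `categorize` are assumptions for illustration and are not taken from GCV.

```python
import re


def categorize(clue):
    """Return (set of category labels, number of words in the answer)."""
    text = clue.strip()
    count = re.search(r"\((\d+) words\)", text)
    answer_words = int(count.group(1)) if count else 1
    # Clue text with parenthetical hints such as "(abbr.)" removed.
    core = re.sub(r"\s*\([^)]*\)", "", text)
    cats = set()
    if "___" in core:
        cats.add("fill-in-the-blank")
    if re.match(r"(type|kind) of\b", core, re.IGNORECASE):
        cats.add("type-of")
    if re.search(r"\(abbr\.?\)", text, re.IGNORECASE):
        cats.add("abbreviation")
    if re.search(r"\b(and|or)\b", core, re.IGNORECASE):
        cats.add("and-or")
    if re.match(r"not\b", core, re.IGNORECASE):
        cats.add("antonym")
    if not cats and len(core.split()) == 1:
        cats.add("synonym")  # a bare single word is treated as a synonym clue
    return cats, answer_words
```

For example, `categorize("Mary ___ little lamb (2 words)")` labels the clue as fill-in-the-blank with a two-word answer.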
13 Clue Preprocessing
- Translate clues to Google-friendly forms
- To ___ is human
- To is human
- To is human
- Mary ___ little lamb (2 words)
- Mary little lamb
- ___ to Joy by Beethoven
- to Joy by Beethoven
- to Joy by Beethoven
14 Clue Preprocessing
- Translate clues to Google-friendly forms
- Diplomacy
- synonyms of Diplomacy
- Not dry
- opposite of dry
- antonyms of dry
- Joy
- synonyms of Joy
15 Clue Preprocessing
- Translate clues to Google-friendly forms
- Type of dancing or Kind of dancing
- dancing
- Second sight (abbr.)
- Second sight
- abbreviations of Second sight
- Superman's admirer
- admirer of Superman
16 Clue Preprocessing
- Translate clues to Google-friendly forms
- Couldn't move
- Could not move
- Could opposite of move
- Could antonyms of move
- Knight or Danson
- Knight
- Danson
17 Clue Preprocessing
- Translate clues to Google-friendly forms
- Bosley and Arnold
- Bosley
- Arnold
- Append an s
- Henson, and others or Henson, and namesakes
- Henson
- Append an s
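The clue-to-query translations on the preceding slides can be approximated with a handful of rewrite rules. The sketch below reproduces the slide examples only; the rule order, the regular expressions, and the name `translate` are assumptions, and any quoting or wildcard syntax the real queries may have used is not recoverable from the slides and is left out.

```python
import re


def translate(clue):
    """Return (list of query strings, whether to append an 's' to answers)."""
    text = clue.strip()
    is_abbr = re.search(r"\(abbr\.?\)", text, re.IGNORECASE)
    text = re.sub(r"\s*\([^)]*\)", "", text).strip()  # drop "(2 words)" etc.
    if is_abbr:
        return [text, f"abbreviations of {text}"], False
    if "___" in text:  # fill-in-the-blank: drop the blank
        return [" ".join(text.replace("___", " ").split())], False
    m = re.fullmatch(r"(.+?),? and (?:others|namesakes)", text, re.IGNORECASE)
    if m:
        return [m.group(1)], True
    m = re.fullmatch(r"(.+) and (.+)", text)
    if m:
        return [m.group(1), m.group(2)], True
    m = re.fullmatch(r"(.+) or (.+)", text)
    if m:
        return [m.group(1), m.group(2)], False
    m = re.fullmatch(r"(?:type|kind) of (.+)", text, re.IGNORECASE)
    if m:
        return [m.group(1)], False
    m = re.fullmatch(r"not (.+)", text, re.IGNORECASE)
    if m:
        return [f"opposite of {m.group(1)}", f"antonyms of {m.group(1)}"], False
    m = re.fullmatch(r"(\w+)'s (.+)", text)
    if m:
        return [f"{m.group(2)} of {m.group(1)}"], False
    m = re.fullmatch(r"(\w+)n't (.+)", text)
    if m:
        stem, rest = m.groups()
        return [f"{stem} not {rest}", f"{stem} opposite of {rest}",
                f"{stem} antonyms of {rest}"], False
    if len(text.split()) == 1:
        return [f"synonyms of {text}"], False
    return [text], False
```

For instance, `translate("Bosley and Arnold")` yields the two queries "Bosley" and "Arnold" together with a flag saying the answer should be pluralized.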
19 Results of Google-Querying
- GCV excels at solving fill-in-the-blank and pop-culture clues
- Why?
- Though results are encouraging, using keyword-based searching is limited
- Why?
20 Populating the Crossword Grid
- Use a Depth-First Search (DFS) algorithm
- Fill in the crossword grid based on confidence values of candidate words
- At each iteration
- Select candidate word with highest confidence value amongst clues not yet placed
- Attempt to fit candidate word into grid
- Halt when a solution is found or a dead-end occurs
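The iteration described above can be sketched as a confidence-ordered DFS with plain chronological backtracking. The data layout (`slots` mapping a clue id to its cells, `candidates` mapping a clue id to (word, confidence) pairs) and the function name `solve` are assumptions for this example. The search is complete but makes no attempt to avoid revisiting equivalent placements in a different order.

```python
def solve(slots, candidates, letters=None, placed=None):
    """Place the highest-confidence fitting candidate first; backtrack on dead-ends.

    slots: dict clue_id -> list of (row, col) cells.
    candidates: dict clue_id -> list of (word, confidence).
    Returns dict clue_id -> word, or None if no consistent fill exists.
    """
    letters = {} if letters is None else letters
    placed = {} if placed is None else placed
    if len(placed) == len(slots):
        return placed  # every clue has an answer in the grid
    # Candidates of all clues not yet placed, highest confidence first.
    options = sorted(
        ((conf, clue, word)
         for clue, cands in candidates.items() if clue not in placed
         for word, conf in cands),
        key=lambda option: -option[0])
    for conf, clue, word in options:
        cells = slots[clue]
        fits = len(word) == len(cells) and all(
            letters.get(cell, ch) == ch for cell, ch in zip(cells, word))
        if not fits:
            continue
        new_cells = [cell for cell in cells if cell not in letters]
        letters.update(zip(cells, word))
        placed[clue] = word
        result = solve(slots, candidates, letters, placed)
        if result is not None:
            return result
        # Dead-end: remove the word just placed and try the next option.
        for cell in new_cells:
            del letters[cell]
        del placed[clue]
    return None


slots = {"1A": [(0, 0), (0, 1), (0, 2)], "1D": [(0, 0), (1, 0), (2, 0)]}
candidates = {"1A": [("CAT", 0.9), ("COT", 0.5)],
              "1D": [("DOG", 0.95), ("COW", 0.6)]}
answer = solve(slots, candidates)
```

In this small example the highest-confidence candidate (DOG) leads to a dead-end, so it is removed and the search continues with CAT and then COW.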
21 Populating the Crossword Grid
- When a dead-end occurs, what do we do?
- Backtrack: Remove last word placed in grid
- Disadvantages?
- Backjump: Identify culprit and remove all words back to culprit word
- Disadvantages?
22 Populating the Crossword Grid
- When a dead-end occurs, what do we do?
- Extricating Backjump: Identify and remove the culprit
- Disadvantages?
- How do we identify the culprit?
23 Extricating Backjumping
- Assign weights to the squares of the grid
- Square weights correspond to confidence values of candidate words placed
- e.g. Place TWAIN with confidence value of 10 at 5-Across
24 Extricating Backjumping
- Weights of interlocking words are multiplied
25 Extricating Backjumping
- Define grid weight of a word as the sum of each individual square weight
- e.g. TWAIN = 100, NOW = 72
26 Extricating Backjumping
- When a dead-end occurs, the culprit is the word with the lowest grid weight
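The weighting scheme can be worked through in code. TWAIN's confidence of 10 and the grid weights 100 and 72 come from the slides; NOW's confidence of 6 and its crossing TWAIN at the shared N are assumptions, chosen because they reproduce those two figures (4 x 10 + 60 = 100 and 60 + 6 + 6 = 72). The cell coordinates and function names are likewise illustrative.

```python
from collections import defaultdict


def square_weights(placements):
    """Weight of each square: product of the confidences of the words covering it."""
    weights = defaultdict(lambda: 1)
    for _word, confidence, cells in placements:
        for cell in cells:
            weights[cell] *= confidence
    return weights


def grid_weight(cells, weights):
    """Grid weight of a word: sum of the weights of its squares."""
    return sum(weights[cell] for cell in cells)


def find_culprit(placements):
    """The culprit is the placed word with the lowest grid weight."""
    weights = square_weights(placements)
    return min(placements, key=lambda p: grid_weight(p[2], weights))[0]


placements = [
    ("TWAIN", 10, [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)]),  # across
    ("NOW", 6, [(0, 4), (1, 4), (2, 4)]),  # down, sharing the N (assumed)
]
weights = square_weights(placements)
twain_weight = grid_weight(placements[0][2], weights)  # 100
now_weight = grid_weight(placements[1][2], weights)    # 72
culprit = find_culprit(placements)                     # "NOW"
```

Because interlocking squares multiply confidences, a word that is well supported by its crossings gains weight and becomes less likely to be singled out as the culprit.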
27 A Sampling of Crossword Puzzles
35 Results of Grid Solving
36 Limitations of Keyword-Based Search
- Google and GCV use keyword-based tricks to artificially improve result sets
- Word frequency and proximity to other words
- Additional keywords to help direct queries to good candidate answers
- e.g. synonyms of
- Grammatical and structural rearrangements
37 Limitations of Keyword-Based Search
- Lack of precision in keyword-based search
- Irrelevant results in candidate answer lists
- Confidence values based on word frequency produce many false positives
- Correct answer is often buried among other mediocre (and incorrect!) candidates
38 In Conclusion...
- Other uses of the Web as an automated information source?
- Keyword-based search is insufficient
- Lacks the means for machine-interpretable information
- Semantic Web