How to solve the Anagrams Puzzle in SQL
June 8th, 2010 By Frank Zhou
The following is an interesting puzzle posted on the programming praxis website:
Anagrams
Words that are formed from the same set of letters are anagrams of each other. For instance, pots, post, stop, spot, opts, and tops are anagrams.
Your task is to write a program that, given a dictionary and an input word, prints all the anagrams of the input word.
You are also to determine the largest anagram class in your dictionary.
The flat file “C:\oracle\word.txt” contains the following words :
pots
post
stop
spot
opts
tops
pares
parse
pears
rapes
reaps
spare
spear
angor
argon
goran
grano
groan
nagor
orang
organ
rogan
—————————————————————————-SQL Solution——————————————————————————————
COLUMN largest_anagram_class FORMAT A68
variable input_str varchar2(38)
exec :input_str:='tops'
create or replace directory tmp as 'C:\oracle'
CREATE TABLE words (str VARCHAR2(38))
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY tmp
ACCESS PARAMETERS
(
records delimited by newline
fields terminated by whitespace
missing field values are null
( str
)
)
LOCATION ('word.txt')
)
REJECT LIMIT UNLIMITED;
--------------------------------------------------------------Find all the anagrams of the input word ---------------------------------------------------------
WITH DATA AS
(SELECT ordered_str, str, COUNT(str) OVER (PARTITION BY ordered_str) as cnt
FROM
(SELECT DISTINCT REPLACE(LISTAGG(trim(column_value), ',') WITHIN GROUP (ORDER BY trim(column_value)) OVER (PARTITION BY str), ',') as ordered_str, str
FROM words a,
xmltable(rtrim(REGEXP_REPLACE(a.str,'(.)', '"\1",'), ','))
)
),
INPUT AS (
SELECT DISTINCT REPLACE(LISTAGG(trim(column_value), ',') WITHIN GROUP (ORDER BY trim(column_value)) OVER (PARTITION BY str), ',') as ordered_input
FROM (SELECT :input_str as str FROM dual) a,
xmltable(rtrim(REGEXP_REPLACE(a.str,'(.)', '"\1",'), ','))
)
SELECT str, CASE WHEN ROWNUM = 1 THEN cnt END as cnt
FROM DATA
WHERE ordered_str IN
(SELECT ordered_input
FROM INPUT);
STR CNT
-------------------------------------- ----------
tops 6
pots
opts
spot
post
stop
6 rows selected.
SQL>
---------------------------------------------------Determine the largest anagram class in the dictionary---------------------------------------------------------
WITH DATA AS
(SELECT ordered_str, str, COUNT(str) OVER (PARTITION BY ordered_str) as cnt
FROM
(SELECT DISTINCT REPLACE(LISTAGG(trim(column_value), ',') WITHIN GROUP (ORDER BY trim(column_value)) OVER (PARTITION BY str), ',') as ordered_str, str
FROM words a,
xmltable(rtrim(REGEXP_REPLACE(a.str,'(.)', '"\1",'), ','))
)
)
SELECT '[ '||anagrams||' ]' AS largest_anagram_class, max_ct
FROM
(SELECT DISTINCT LISTAGG(str, '->') WITHIN GROUP (ORDER BY str ) OVER (PARTITION BY ordered_str) AS anagrams, cnt, MAX(cnt) OVER() AS max_ct
FROM DATA
)
WHERE max_ct = cnt;
LARGEST_ANAGRAM_CLASS MAX_CT
-------------------------------------------------------------------- ----------
[ angor->argon->goran->grano->groan->nagor->orang->organ->rogan ] 9
SQL>

June 24th, 2010 at 9:05 am
SELECT * FROM words WHERE TRANSLATE(str,’$'||:input_str,’$') IS NULL;
June 24th, 2010 at 9:10 am
June 24th, 2010 at 11:40 am
Hi newkid,
select TRANSLATE(’angor’,'$’||’organ’,'$’) from dual
VS
select TRANSLATE(’aaaaangor’,'$’||’organ’,'$’) from dual
You will get the same result in both cases, So Transalte function doesn’t work here either.
Also the query’s performance will be slow when using connect by in this case.
Thanks,
Frank
June 24th, 2010 at 1:06 pm
Well, I might have a misunderstanding of “the same set of letters”. How about this?
June 24th, 2010 at 1:19 pm
SELECT w,wmsys.wm_concat(str),COUNT(*) cnt FROM ( SELECT str,REPLACE(SYS_CONNECT_BY_PATH(c,'/'),'/') w FROM ( SELECT str ,SUBSTR(str,rn,1) c ,ROW_NUMBER() OVER(PARTITION BY str ORDER BY SUBSTR(str,rn,1)) rn FROM words, (SELECT ROWNUM rn FROM DUAL CONNECT BY ROWNUM<=(SELECT MAX(LENGTH(str)) FROM words)) WHERE rn<=LENGTH(str) ) WHERE CONNECT_BY_ISLEAF=1 START WITH rn=1 CONNECT BY str=PRIOR str AND rn = PRIOR rn+1 ) GROUP BY w ORDER BY cnt DESC;