How to simulate this special recursive update and merge of data in SQL
September 28th, 2007 By Frank Zhou
The sample input:
create table T (C1 varchar2(10), C2 varchar2(10));
insert into T values('P501', 'C101');
insert into T values('P501', 'C102');
insert into T values('P502', 'C105');
insert into T values('P503', 'C102');
insert into T values('P503', 'C103');
insert into T values('P503', 'C104');
insert into T values('P504', 'C105');
insert into T values('P504', 'C106');
insert into T values('P505', 'C106');
insert into T values('P505', 'C107');
insert into T values('P506', 'C108');
The following is the recursive “update & merge” of data example. our goal is to achieve this in a single SQL. If any data in C2 belongs to more than one value in C1 then merge C1:
SQL> break on c2;
SQL> select c1, c2 from t order by c2;
C1 C2
---------- ----------
P501 C101
P501 C102
P503
P503 C103
P503 C104
P502 C105
P504
P504 C106
P505
P505 C107
P506 C108
So C102 belongs to both 'P501' and 'P503'. Let's merge 'P501' and 'P503' and name the new one 'P501':
SQL> update t set c1='P501' where c1='P503'; SQL> select c1, c2 from t order by c2; C1 C2 ---------- ---------- P501 C101 P501 C102 P501 P501 C103 P501 C104 P502 C105 P504 P504 C106 P505 P505 C107 P506 C108
Now since 'C105' belongs to both 'P502' and 'P504', let's merge them:
SQL> update t set c1='P502' where c1='P504'; SQL> select c1, c2 from t order by c2; C1 C2 ---------- ---------- P501 C101 P501 C102 P501 P501 C103 P501 C104 P502 C105 P502 P502 C106 P505 P505 C107 P506 C108
Since 'C106' now belongs to both 'P505' and 'P502', let's merge them:
SQL> update t set c1='P502' where c1='P505'; SQL> select c1, c2 from t order by c2; C1 C2 ---------- ---------- P501 C101 P501 C102 P501 P501 C103 P501 C104 P502 C105 P502 P502 C106 P502 P502 C107 P506 C108
Now we have merged all the C1 rows...
SQL> break on c1;
SQL> select c1 , c2 from t order by c1;
C1 C2
---------- ----------
P501 C101
C102
C102
C104
C103
P502 C105
C105
C106
C106
C107
P506 C108
Here is what we want as the final answer:
C1 C2 ---- ------------------------- P501 C101, C102, C103, C104 P502 C105, c106, C107 P506 C108
---------------------------------SQL Solution---------------------------
The following SQL pattern can be used to simulate the recursive "update & merge" of data example from above.
SELECT c1,trim(BOTH ','FROM
regexp_replace(XMLAgg(XMLElement(X,c2) order by c1),'<X>|</X><X>|</X>',',')) c2
FROM
(SELECT DISTINCT MIN(substr(str,1,
CASE WHEN instr(str,',')>0
THEN instr(str,',')-1
ELSE length(str) END)) OVER (PARTITION BY c1) c1,c2
FROM
(SELECT c1, c2, path,ltrim(sys_connect_by_path(c1,','),',') str
FROM
(SELECT c1,c2,rn,ltrim(sys_connect_by_path(c1,','),',') path
FROM(SELECT c1,c2,row_number() over (PARTITION BY c2 ORDER BY c1) rn
FROM t )
START WITH rn = 1
CONNECT BY c2 = PRIOR c2
AND PRIOR rn = rn -1
)
CONNECT BY PRIOR substr(path,instr(path,',',-1)+1) =
substr(path,0,instr(path,',',1)-1)
AND c1 > PRIOR c1
)
)
GROUP BY c1;
C1 C2
------ ------------------------------
P501 C101,C102,C103,C104
P502 C105,C106,C107
P506 C108
