What is the problem?
Sometimes you are faced with a bad (poorly normalized) schema design. For example:

But, you may need the output to be in the following normalized form:

The correct solution may be to fix the schema design, but in most cases this will not be feasible.
Test Case 1:
We need to write our query to present the data as it should be stored, in spite of how it is being actually stored.
Let us build the following test case:
CREATE TABLE Cust_Phones
(
CustomerID INT PRIMARY KEY,
Phone1 VARCHAR(32),
Phone2 VARCHAR(32),
Phone3 VARCHAR(32)
);
INSERT into Cust_Phones (CustomerID, Phone1, Phone2, Phone3)
VALUES (1,’705-491-1111′, ‘705-491-1110’, NULL);
INSERT into Cust_Phones (CustomerID, Phone1, Phone2, Phone3)
VALUES (2,’613-492-2222′, NULL, NULL);
INSERT into Cust_Phones (CustomerID, Phone1, Phone2, Phone3)
VALUES (3,’416-493-3333′, ‘416-493-3330’, ‘416-493-3339’);
Commit;
Select * from cust_phones;

Now to normalize the output, we will write our query to use the unpivot operator as follows:
SELECT CustomerID, Phone
FROM
( SELECT CustomerID, Phone1, Phone2, Phone3 FROM Cust_Phones )
UNPIVOT ( Phone FOR Phones IN (Phone1, Phone2, Phone3));

The magic operator here is the “unpivot”
The “Phones” alias is saying: “extract a new row for every value you find in the columns Phone1, Phone2, and Phone3.
Test Case 2:
What if we have a more complicated case of 3 phone numbers with different types, like:

And we need to normalize the output to be as follows:

Let us build the following test case:
CREATE TABLE Cust_Phones2
(
CustomerID INT PRIMARY KEY,
Phone1 VARCHAR(32),
PhoneType1 CHAR(4),
Phone2 VARCHAR(32),
PhoneType2 CHAR(4),
Phone3 VARCHAR(32),
PhoneType3 CHAR(4)
);
INSERT into Cust_Phones2 VALUES (1,’705-491-1111′, ‘cell’, ‘705-491-1110’, ‘home’, NULL,NULL);
INSERT into Cust_Phones2 VALUES (2,’613-492-2222′, ‘home’, NULL, NULL, NULL, NULL);
INSERT into Cust_Phones2 VALUES (3,’416-493-3333′, ‘work’, ‘416-493-3330’, ‘cell’,’416-493-3339′, ‘home’);
commit;
select * from Cust_Phones2;

Now to normalize the output, we will write our query to use the unpivot operator as follows:
SELECT CustomerID, Phone, PhoneType
FROM
( SELECT CustomerID, Phone, PhoneType, Phones , PhoneTypes
, SUBSTR(Phones, LENGTH(Phones) – REGEXP_INSTR( REVERSE(Phones) , ‘[^0-9]’) + 2, 32) as idp,
SUBSTR(PhoneTypes, LENGTH(PhoneTypes) – REGEXP_INSTR(REVERSE(PhoneTypes) , ‘[^0-9]’) + 2, 32) as idpt
FROM ( SELECT CustomerID, Phone1, Phone2, Phone3, PhoneType1, PhoneType2, PhoneType3 FROM Cust_Phones2 )
UNPIVOT (Phone FOR Phones IN (PHONE1, PHONE2, PHONE3))
UNPIVOT (PhoneType FOR PhoneTypes IN (PhoneType1, PhoneType2, PhoneType3)) )
WHERE idp = idpt;

We used one “unpivot” operator to normalize the “phone” è produced number of rows = number of phones of each customer.
We used another “unpivot” operator to normalize the “PhoneType” è produced number of rows = number of phone types of each customer.
Note that if you run only the middle select query:
SELECT CustomerID, Phone, PhoneType, Phones , PhoneTypes
, SUBSTR(Phones, LENGTH(Phones) – REGEXP_INSTR( REVERSE(Phones) , ‘[^0-9]’) + 2, 32) as idp,
SUBSTR(PhoneTypes, LENGTH(PhoneTypes) – REGEXP_INSTR(REVERSE(PhoneTypes) , ‘[^0-9]’) + 2, 32) as idpt
FROM ( SELECT CustomerID, Phone1, Phone2, Phone3, PhoneType1, PhoneType2, PhoneType3 FROM Cust_Phones2 )
UNPIVOT (Phone FOR Phones IN (PHONE1, PHONE2, PHONE3))
UNPIVOT (PhoneType FOR PhoneTypes IN (PhoneType1, PhoneType2, PhoneType3))
You will get the following output:

Which is like a Cartesian product of the number of phones * the number of corresponding phone types
To get only the relevant rows, we added the last where condition:
WHERE idp = idpt;
The following part of the query:
SUBSTR(Phones, LENGTH(Phones) – REGEXP_INSTR( REVERSE(Phones) , ‘[^0-9]’) + 2, 32) as idp,
Tills us it is phone1 or phone2 or phone3. For example:
select substr(‘PHONE1’,LENGTH(‘PHONE1’) – REGEXP_INSTR(REVERSE(‘PHONE1′),'[^0-9]’)+2,32) from dual;
Will produce 1
While:
select substr(‘PHONE2’,LENGTH(‘PHONE2’) – REGEXP_INSTR(REVERSE(‘PHONE2′),'[^0-9]’)+2,32) from dual;
Will produce 2
and so on.
The following part of the query:
SUBSTR(PhoneTypes, LENGTH(PhoneTypes) – REGEXP_INSTR(REVERSE(PhoneTypes) , ‘[^0-9]’) + 2, 32) as idpt
Tills us it is phonetype1 or phonetype2 or phonetype3. For example:
select substr(‘PHONETYPE1’,LENGTH(‘PHONETYPE1’) – REGEXP_INSTR(REVERSE(‘PHONETYPE1′),'[^0-9]’)+2,32) from dual;
- Will produce 1
Select REGEXP_INSTR(REVERSE(‘PHONETYPE1′),'[^0-9]’) from dual;
- Will return 2, which is the position of the first non-numeric character from the end of the string.
Test Case 3:
What if you add a new phone number to the above table “Phone4” & “PhoneType4”,in this case you have to modify your queries. But is there a more dynamic way to build the UNPIVOT queries without advanced knowledge of the number of Phone/PhoneTypes? . This may be a subject of another future blog !
References:
A similar SQL Server case:

Donate to support our blogging work
$1.00
Thanks
Ahmed