Here’s a quick way to generate dummy data for tests that require a large number of records.
```
SQL> DROP TABLE lots_o_data;

Table dropped.

SQL> CREATE TABLE lots_o_data
  2  ( id        NUMBER
  3  , name      VARCHAR2(5)
  4  , category  VARCHAR2(26)
  5  );

Table created.

SQL> INSERT INTO lots_o_data VALUES (1, 'Lotso', 'Wow this is a lot of data.');

1 row created.

SQL> INSERT INTO lots_o_data VALUES (2, 'More', 'Bunches and bunches.');

1 row created.

SQL> COMMIT;

Commit complete.

SQL> INSERT INTO lots_o_data
  2  SELECT a.id, a.name, b.category
  3  FROM lots_o_data a CROSS JOIN lots_o_data b;

4 rows created.

SQL> COMMIT;

Commit complete.

SQL> INSERT INTO lots_o_data
  2  SELECT a.id, a.name, b.category
  3  FROM lots_o_data a CROSS JOIN lots_o_data b;

36 rows created.

SQL> COMMIT;

Commit complete.

SQL> INSERT INTO lots_o_data
  2  SELECT a.id, a.name, b.category
  3  FROM lots_o_data a CROSS JOIN lots_o_data b;

1764 rows created.

SQL> COMMIT;

Commit complete.

SQL> INSERT INTO lots_o_data
  2  SELECT a.id, a.name, b.category
  3  FROM lots_o_data a CROSS JOIN lots_o_data b;

3261636 rows created.

SQL> COMMIT;

Commit complete.

SQL> SELECT COUNT(*)
  2  FROM lots_o_data;

  COUNT(*)
----------
   3263442

SQL> SPOOL OFF
```
Note the following:
- With only four INSERT … SELECT statements based on a Cartesian product, we generated over 3.2 million rows of data. Each pass joins the table to itself, so a table of n rows grows to n + n² rows.
- A few more INSERT … SELECT statements in the same pattern would easily push the row count into the billions; a sketch of automating the repetition follows this list.
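Rather than rerunning the statement by hand, the repetition can be wrapped in a loop. Here is a minimal sketch, assuming the lots_o_data table created above and a hypothetical target of one million rows (the target variable and its value are illustrative, not from the original session):

```sql
-- Sketch only: repeat the Cartesian-product insert until the table
-- reaches a target row count. Each pass grows n rows to n + n*n rows,
-- so the target is reached in very few iterations.
DECLARE
  v_target PLS_INTEGER := 1000000;   -- hypothetical target size
  v_count  PLS_INTEGER;
BEGIN
  LOOP
    SELECT COUNT(*) INTO v_count FROM lots_o_data;
    EXIT WHEN v_count >= v_target;

    INSERT INTO lots_o_data
      SELECT a.id, a.name, b.category
      FROM   lots_o_data a CROSS JOIN lots_o_data b;

    COMMIT;
  END LOOP;
END;
/
```

Because the growth is quadratic per pass, the final iteration will usually overshoot the target; that is generally fine for dummy data, but worth keeping in mind if storage is tight.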
If you need a lot of data for performance testing or capacity testing, this is an easy way to generate as much as you require.
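For capacity testing in particular, it can help to see how much space the generated rows actually consume. One way, assuming access to the standard Oracle USER_SEGMENTS dictionary view, is:

```sql
-- Sketch: report the space allocated to the generated table, in MB.
SELECT segment_name,
       ROUND(bytes / 1024 / 1024) AS size_mb
FROM   user_segments
WHERE  segment_name = 'LOTS_O_DATA';
```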
I have been building out a data lake environment ahead of a series of large data feeds I’m expecting soon, and I found this to be a useful way to prepare and exercise that environment.