How To Use Generative AI and Python to Create Designer Dummy Datasets | by Mia Dwyer | Apr, 2024


A Simple Guide for Practical Applications

Have you ever needed a dataset that isn’t readily available? Wanted to generate data that matches your exact requirements for interviewing prospective data science candidates, software testing + development, or training models? Or what about just wanting the right data to demonstrate skills + techniques for a Medium article (without violating copyright laws)?

Enter dummy data! 📊✨

Image created by me, using DALL-E

Until recently, creating dummy datasets was tedious. The technical folks among us could generate them with expertly written Python code, but coding up all your requirements by hand is time-intensive and has a high technical barrier to entry.

Let’s say we want to test a candidate applying for a data science role at a fintech. There are real-world patterns we want them to identify and discuss, but for privacy reasons we cannot share actual customer transaction data externally.

The solution? Leverage the power of Generative AI to write the complex Python code that outputs our ✨Designer Dummy Datasets✨

Let’s look at how we can prompt GPT-4 to generate a dataset that meets all of our exact, and somewhat tedious, requirements:

Hi there! You are my expert python programmer and data scientist extraordinaire. 
I need to generate a "designer dummy dataset" that meets the following conditions and specifications,
can you please write the python code for me to generate it?

The dataset is transactions in 2019, 2020, and 2021
I want the dataset to contain the following columns: id, transaction_timestamp, user_id, amount, merchant, network, card_type.
The merchant_name should be either: Walmart, Netflix.com, Starbucks, Home Depot, 7/11, Dunkin Donuts, Trader Joe's, and Amazon.com
The user_id should be between 1 and 100 - the amount should be 9.99 for every Netflix.com purchase, less than $10 for Starbucks and Dunkin Donuts, between $25 and 500 for Walmart, Amazon.com, and Home Depot, less than $25 for 7/11, and between $10 and $250 for Trader Joe's
There…
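
To make the requirements above concrete, here is a minimal sketch of the kind of Python code such a prompt could produce. It is not the code GPT-4 actually returned, just an illustration written with the standard library only. A few things here are assumptions: the function name `make_transactions`, the `n_rows` and `seed` parameters, and the $1.00 lower bound for merchants where the prompt only gives an upper limit. The merchant column is called `merchant`, as in the column list. Because the prompt is cut off before it says anything about `network` and `card_type`, those columns are left as empty placeholders rather than guessed.

```python
import random
from datetime import datetime, timedelta

# Amount bounds per merchant, in cents (inclusive), based on the prompt.
# The 100-cent ($1.00) lower bounds are an assumption: the prompt only
# says "less than $10" / "less than $25" for those merchants.
AMOUNT_RULES_CENTS = {
    "Walmart": (2500, 50000),
    "Netflix.com": (999, 999),      # always exactly $9.99
    "Starbucks": (100, 999),        # less than $10
    "Home Depot": (2500, 50000),
    "7/11": (100, 2499),            # less than $25
    "Dunkin Donuts": (100, 999),    # less than $10
    "Trader Joe's": (1000, 25000),
    "Amazon.com": (2500, 50000),
}

START = datetime(2019, 1, 1)
END = datetime(2022, 1, 1)  # exclusive, so every timestamp falls in 2019-2021


def make_transactions(n_rows=1000, seed=42):
    """Return a list of dicts, one per dummy transaction."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    span_seconds = int((END - START).total_seconds())
    merchants = list(AMOUNT_RULES_CENTS)
    rows = []
    for i in range(1, n_rows + 1):
        merchant = rng.choice(merchants)
        low, high = AMOUNT_RULES_CENTS[merchant]
        rows.append({
            "id": i,
            "transaction_timestamp": START
            + timedelta(seconds=rng.randrange(span_seconds)),
            "user_id": rng.randint(1, 100),
            "amount": rng.randint(low, high) / 100,  # cents to dollars
            "merchant": merchant,
            # Placeholders: the rules for these columns are not visible
            # in the truncated prompt, so no values are invented here.
            "network": None,
            "card_type": None,
        })
    return rows
```

Calling `make_transactions(n_rows=5000)` returns a list of dictionaries that can be written out with `csv.DictWriter` or passed to `pandas.DataFrame`. Keeping the rules in one dictionary makes it easy to check the code against the prompt line by line, which is worth doing with any AI-generated code too.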
