AI for Research: How ChatGPT Gave Me Courage to Code
This week’s post on Generative AI is from Assistant Professor of Business Analytics, Dr. Olga Biedova. Dr. Biedova is a member of the School of Business Generative AI Taskforce.
In this post, Dr. Biedova discusses how she used AI in coding on a research project.
***********************************************************************
A few years ago, I came up with an exciting research idea that I thought could make a case for more transparency around cybersecurity incidents. After some brainstorming, my research question took shape: How does a ransomware disclosure affect a company’s stock price? But I faced two big problems. First, I couldn’t find good data on ransomware incidents. Second, I didn’t know much about the ideal analysis method: event studies. Around that time, OpenAI launched ChatGPT, built on GPT-3.5, the first version to really capture public attention. But I wasn’t an early adopter of generative AI, so I went ahead with my research using the usual mix of resources available at the College and online.
For the first problem, I manually scoured the Internet, SEC filings, and every open database I could find on cybersecurity incidents. It’s hard to estimate the hours spent on this, but I’m sure the number is in the hundreds. Later, I realized that ChatGPT, especially its newer versions with search capabilities, could have saved me a lot of that time. You can prompt it to list NYSE- or NASDAQ-traded companies with confirmed ransomware attacks, and it responds in minutes. Is it perfect? No. Would you still need to verify each company? Yes. But in cases where you don’t need an exhaustive dataset, starting here is better than starting from scratch.
For my lack of experience with event studies, I hoped to lean on my limited R skills and find a pre-existing event study library. However, no open-source tools were available, and the only tool I found required a monthly subscription. Getting even preliminary results was a slog, and I had to spend days formatting the input data to meet the tool’s specific requirements. Any slight change required hours to adjust, and what bothered me most was the tool’s “black box” nature. I couldn’t see the code, so I just had to trust that it was working correctly. I knew I needed my own program for full transparency, but I didn’t have enough expertise to create it in a reasonable timeframe.
I put the project on hold until this summer, when I decided to give ChatGPT a try for coding help. First, I asked it to generate some charts, and within 20 minutes and a few prompts, I had R code that worked exactly as I needed. The next task was more ambitious: I asked ChatGPT to write code for an event study. I explained, “I have a list of ransomware incidents in public companies and want to see how ransomware disclosure affects their stock prices. What data do you need from me?” ChatGPT listed the inputs it needed, told me to upload my files, and guided me step by step. After a few rounds of back and forth, I had working code for a basic event study, with detailed comments explaining each step and even some debugging tips to catch potential errors.
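For readers new to the method, the core of a basic event study can be sketched in a few lines. This is only an illustration, written in Python rather than R, and it is not the code from the project. It assumes the common market-model approach (the post does not say which model was used), the function names are hypothetical, and every return value is made up.

```python
from statistics import mean


def fit_market_model(stock_returns, market_returns):
    """Least-squares fit of stock = alpha + beta * market over the estimation window."""
    mx = mean(market_returns)
    my = mean(stock_returns)
    sxx = sum((x - mx) ** 2 for x in market_returns)
    sxy = sum((x - mx) * (y - my) for x, y in zip(market_returns, stock_returns))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta


def abnormal_returns(stock_returns, market_returns, alpha, beta):
    """Abnormal return (AR) = actual return minus the return the model predicts."""
    return [r - (alpha + beta * m) for r, m in zip(stock_returns, market_returns)]


# Made-up daily returns for the estimation window (well before the disclosure).
estimation_market = [0.010, -0.005, 0.002, 0.007, -0.012, 0.004]
estimation_stock = [0.001 + 1.5 * m for m in estimation_market]

alpha, beta = fit_market_model(estimation_stock, estimation_market)

# Made-up daily returns for a three-day event window around the disclosure.
event_market = [0.003, -0.002, 0.001]
event_stock = [0.004, -0.030, -0.010]

ars = abnormal_returns(event_stock, event_market, alpha, beta)
car = sum(ars)  # cumulative abnormal return (CAR) over the event window
print(f"alpha={alpha:.4f} beta={beta:.2f} CAR={car:.4f}")
```

A real study would repeat this for every company in the sample, use a much longer estimation window, and then test whether the average abnormal returns differ from zero.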
After putting off this task for over a year, it was surreal to get it done in just a few days with ChatGPT’s help. I knew generative AI could assist with coding, but I didn’t realize how much it could simplify the process. At the same time, I saw firsthand that AI isn’t perfect (which makes me less worried about my skills becoming obsolete anytime soon). By the time I got ChatGPT involved, I had a solid understanding of the event study method, which helped me spot and correct some of its mistakes. My basic R knowledge also came in handy.
Here are some examples of the follow-up prompts I had to write to correct the code:
“Right now, four companies are returned as NA. This is because the incident happened during a non-trading day. I want cases like this to be handled differently. Take the next trading day to evaluate the effect of the incident.”
“The T-test you suggested just tests the entire column. Instead, it should test AAR for a specific day in the event window.”
“This test does not make any sense. rank() function just re-arranges the values 1-97. But the average stays the same.”
“It looks like you are running a T-test on AARs. It should be a test on ARs instead.”
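Two of these corrections are easy to show in code: moving an incident that falls on a non-trading day to the next trading day, and testing the abnormal returns (ARs) of all companies on one specific day of the event window rather than a whole column. The sketch below is in Python rather than R and is not the code ChatGPT produced. The function names, the calendar, and the AR values are all hypothetical, and the test shown is a plain cross-sectional t-statistic, which may differ from the test used in the project.

```python
from bisect import bisect_left
from datetime import date
from math import sqrt
from statistics import mean, stdev


def next_trading_day(event_date, trading_days):
    """Return the first trading day on or after event_date.

    trading_days must be sorted in ascending order. Returns None if the
    event falls after the last day in the calendar.
    """
    i = bisect_left(trading_days, event_date)
    return trading_days[i] if i < len(trading_days) else None


def aar_t_stat(ars_on_day):
    """Average abnormal return (AAR) for one event day and its t-statistic.

    ars_on_day holds one AR per company for the same day of the event
    window, so the test is run on the ARs and not on a column of AARs.
    """
    n = len(ars_on_day)
    aar = mean(ars_on_day)
    standard_error = stdev(ars_on_day) / sqrt(n)
    return aar, aar / standard_error


# Hypothetical calendar: a Friday and the following Monday and Tuesday.
trading_days = [date(2024, 1, 5), date(2024, 1, 8), date(2024, 1, 9)]

# An incident disclosed on Saturday is evaluated on Monday.
aligned = next_trading_day(date(2024, 1, 6), trading_days)

# Made-up ARs of five companies on event day 0.
ars_day0 = [-0.02, -0.01, -0.03, 0.00, -0.04]
aar, t_stat = aar_t_stat(ars_day0)
print(aligned, round(aar, 4), round(t_stat, 3))
```

The t-statistic would then be compared against a t distribution with n - 1 degrees of freedom, where n is the number of companies.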
In the end, I’m amazed at how generative AI sped up my research. While it’s not without flaws, AI helped me get past technical hurdles that would have delayed my project for months.