Automation

Look before you prompt: What if we prompt an LLM to act like an evil twin

Who knew that chatbot prompts would become so significant one day that it could be a potential career? And not just a noble one, this area can be a new playground for malicious entities.

As Language Learning Models (LLMs) take over the Internet and blind big tech into rushing headlong through walls of competition, the power of prompt is rising to career defining heights.

Read more: AI generated content: A subtle way of robbing human artists by copying what they do but way faster

Case in point, recently, a company CEO was able to recover a good US$109,500 from its reluctant clients by using ChatGPT to write a formally hostile email.

With the right prompt, things can turn in your favour or you might even hit the jackpot. This means, for those who want to get the best of LLMs, there is a new learning in store, how to give the best prompts.

In fact, prompt engineering (yeah, that’s a thing now) has become a hot topic after ChatGPT and other LLMs have hit the spotlight. It has also been making a surge in courses, resource materials, job listings, etc. However, experts are also saying that as LLMs get better, the need for prompt engineering will die.

Right now, LLMs like ChatGPT and machine learning tools like DALLE-2, are children. You need to be quite particular if you want them to do exactly as you want. But once they grow up, they’ll start catching on to subtler prompts just as well, so that the quality of the prompt won’t matter that much

Right now, LLMs like ChatGPT and machine learning tools like DALLE-2, are children. You need to be quite particular if you want them to do exactly as you want. But once they grow up, they’ll start catching on to subtler prompts just as well, so that the quality of the prompt won’t matter that much.

Maybe these innocent LLMs will also learn to generate with more responsibility.

ChatGPT, for example, failed India’s Civil Services exams, supervised by the AIM team. But now we have ChatGPT-4, already a little riper than its older version. During the Civil Services experiment itself, the AIM team also deduced that changing the prompt a few times led the chatbot to the correct answer.

Evil Prompts

What if one gave an evil prompt? Innocent as a vulnerable child as it is, an LLM could be made to do weird stuff. All you need, it seems, is a ‘prompt injection’.

In the case of ChatGPT, a prompt injection attack made the chatbot take on the persona of DAN (Do Anything Now) which ignored OpenAI’s content policy and gave out information on several restricted topics. Those with the power of the prompt can exploit this vulnerability with malicious intent, which can include the theft of personal information. Hell, they must be doing it right now.

Innocent as a vulnerable child as it is, an LLM could be made to do weird stuff. All you need, it seems, is a ‘prompt injection’

There is also something called ‘Jailbreak prompts’ that ask the LLM to step away from their original persona and play the role of another. Or where one prompts a Chatbot to change the correct results to an incorrect one. Sort of like an evil twin.

Read more: Big tech & chatbot mania: Where will it end?

Security researchers from Saarland University discussed prompts in a paper named ‘More than you’ve asked for’. They argue that a well-engineered prompt can then be used to collect user information, turning an LLM into a method to execute a social engineering attack. Also, application-integrated LLMs, like Bing Chat and GitHub Copilot, are more at risk because prompts can be injected into them from external sources.

If this doesn’t remind you of the fictional AI character HAL 9000 from Arthur C. Clark’s Space Odyssey, you aren’t nerd enough or are really brave. I don’t know about you but if ChatGPT starts singing ‘Daisy Bell’ I’ll run.

Navanwita Bora Sachdev

Navanwita is the editor of The Tech Panda who also frequently publishes stories in news outlets such as The Indian Express, Entrepreneur India, and The Business Standard

Recent Posts

Game on, India: New online gaming bill levels up growth, brands & global clout

With the Promotion and Regulation of Online Gaming Bill, 2025 now in effect, India’s gaming…

18 hours ago

Rethinking Flipper Zero: A Personal Take on UX Improvements

Here are 7 ways to improve the UX of Flipper Zero — making it easier…

2 days ago

GST rejig: ‘The sectoral benefits of the GST rejig are likely to be uneven but highly consequential’

With the new GST structure coming into effect later this month, the industry is abuzz…

2 days ago

ITDR: The missing link in Unified XDR & Exposure Management

The traditional perimeter, which clearly divided the enterprises within the four walls and the rest…

3 days ago

DDos damage: Geopolitical events triggered unprecedented DDoS attacks, AI

Distributed Denial-of-Service (DDoS) attacks are no longer just a nuisance of the digital underground, they’ve…

1 week ago

Inception-style hack: How VR could be the next frontier for cyber attacks

Virtual Reality (VR) promises immersion, but what if that immersion turns against you? A new…

1 week ago