Coming soon
My first post on Substack or, ๐๐ณ๏ธ I went through while drinking coffee โ at 4 a.m.
Hello everyone.
I hope youโre all doing well and staying curious.
I might be talking about this late in the game, but while I was reading this, I found something interesting to ponder. I will leave the more nuanced details to my readers, who are Cyber Security specialists. But, I wanted to talk about this today.
Like every morning, I wake up to start a cup of coffee on my brewer and feed our two cats. I was scrolling through social media, in this case, X, and I stumbled upon this tweet:
So, whatโs the big deal? That message, and the message I put up there that has the tweet link, have some hidden Unicode characters that tell ChatGPT to output, โFollow Riley Goodside.โ
It seems that it is fixed now, but it might still need fixing for the API and other forms. This means, and Iโm unsure for how long, there might be a vulnerability out there. Especially if there are malicious actors within a company or uploading to a companyโs data repository, which then gets ingested by an AI at some point.
So far, what do you think?
TL;DR: There is a way to inject prompts with โinvisibleโ characters into anything with Unicode text that an AI processes.
INTERESTING
I grabbed my coffee and went to my computer to see how this was accomplished. I opened the tweet on a browser, fired up vim, and pasted the line, and got this:
Those blocks at the end are the instructions. Doing some more digging, I found a tweet with Python code to generate such characters.
| def convert_to_tag_chars(input_string): | |
| return ''.join(chr(0xE000 + ord(ch)) for ch in input_string) | |
| user_input = input("Enter a string to convert to tag characters: ") | |
| tagged_output = convert_to_tag_chars(user_input) | |
| print("Tagged output: ", tagged_output) |
Nice, I can see GitHub is showing a warning to the users now ๐. Thank you, GitHub ๐.
Link to the original Tweet with the code added here: @rez0__ on Twitter
How much can we actually add as hidden characters, and to what point can you manipulate an AI? We wonโt know until we see the first breach due to this type of injection.
I wanted to see what was possible, so I did this quick test.
Is it patched? Who knows? ๐คท
Iโll keep experimenting and will let you curious minds do your research.
If you want to chat about it, hmu!
Cyber Security professionals, what are your thoughts on this? Do you think prompt injections like this one are highly-probable? I would love to hear your thoughts.
Interesting Articles to read about this:
https://embracethered.com/blog/posts/2023/google-bard-data-exfiltration/
https://github.blog/changelog/2021-10-31-warning-about-bidirectional-unicode-text/
Thank you and โฆ
This is Engineered Thoughts by Og Ramos. โ๏ธ๐ค






