
Like many people, I have never met a chatbot that I completely trust. Not only are they prone to hallucinating, that is, making up facts, but you can also never be sure what their parent companies do with the information you give them. Most AI companies say they use your data to further train their models but anonymize it first. You just have to take their word for it.
Still, chatbots can be useful for summarizing and explaining complicated information, such as that contained in many bank statements, medical reports, and mortgage contracts.
So if you choose to upload sensitive documents like these, you should take steps to redact as much personal information as possible, not only to protect your privacy from the AI company, but also to protect against future data breaches that could spread your medical and financial records across the dark web. Here’s how.
The wrong way to redact your sensitive data
First things first, there is a right way and a wrong way to redact sensitive information, particularly from PDF files, which is the format in which most of our bank statements, medical records, and contracts are found. As some attorneys general and lawyers have learned the hard way, incorrectly redacting PDF files provides essentially no protection.
The “wrong” way is to use a PDF reader’s markup tools, such as the pen or highlighter, to scribble over the text or draw black bars across it. While these methods hide the text from view, simply dragging the cursor over the blacked-out line to select it, then copying and pasting, can often recover it. More advanced PDF tools can also easily strip out the pen marks and black highlights entirely, revealing the original text underneath.
In short, the “wrong” way is like placing a piece of electrical tape over lines in a paper document: it hides them from view, but it can easily be peeled off. So if you’re using this method before uploading your sensitive documents to ChatGPT, your heart is in the right place, but your execution is off, and that leaves your sensitive, personally identifiable information highly vulnerable.
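To see why drawn-over text offers so little protection, it can help to check a file yourself before uploading it. The Python sketch below is a rough illustration, not a forensic tool: it inflates the compressed streams inside a PDF and looks for strings you expected to be hidden. The function name `find_leaks` is made up for this example, and the code uses only Python’s standard library. It assumes the text is stored as plain characters in Flate-compressed (or uncompressed) streams, which is common but far from universal. A hit means the text is definitely still in the file; a clean result does not prove the text is gone.

```python
import re
import zlib

# Matches the raw bytes between a PDF "stream" keyword and the next "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def find_leaks(pdf_bytes: bytes, secrets: list[str]) -> list[str]:
    """Return the secrets that still appear in the raw or inflated PDF data.

    Assumption: text is stored as plain characters. Fonts with custom
    encodings, hex strings, or text split across several operators can
    hide a string from this check even though it is still recoverable.
    """
    chunks = [pdf_bytes]  # uncompressed content is searched as-is
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the newline that trails the stream data
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            pass  # not Flate data (an image, or a filter this sketch ignores)
    haystack = b"\n".join(chunks)
    return [s for s in secrets if s.encode("utf-8") in haystack]
```

To try it on a real file, read the file’s bytes and pass in the strings you covered up, for example `find_leaks(Path("statement-copy.pdf").read_bytes(), ["123-45-6789"])`, where both the filename and the number are placeholders. Any string that comes back is still sitting in the file.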
The right way to redact your sensitive information before uploading documents to AI chatbots
The correct way to redact documents digitally is to use a tool specifically designed to destroy the underlying data within the internal code of the PDF. These redaction tools literally remove the underlying text, making it nearly impossible to recover.
The easiest redaction tool I’ve discovered is built into Apple’s Preview app. Preview is the default macOS PDF reader (it’s also available on iPhone, but the iOS version lacks a redaction tool). If you’re a Windows user, note that the platform’s native PDF viewer, Microsoft Edge, doesn’t offer such a feature, although several third-party apps, such as Adobe Acrobat Pro (subscription required) and PDFgear (free), offer redaction tools.
I’ll describe here how to use Preview’s redaction tool, but most redaction tools in other apps work similarly.
How to redact your sensitive information before uploading documents to AI chatbots
The important thing to keep in mind about redaction tools is that they are designed to destroy the text you want to redact, making it unreadable. Therefore, always be sure to first make a copy of the document you plan to upload to a chatbot and redact the information on the copy.
Always keep the original, unredacted document on your computer, so you can access its full content. If you don’t, you will lose the ability to read the original document in its entirety, because a redaction cannot be undone once it has been applied.
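If you handle documents like these often, the copy step can be scripted so it is never skipped. The following is a minimal Python sketch using only the standard library; the function name `make_redaction_copy` and the `-redacted` filename suffix are arbitrary choices for this example, not a convention of any app. It refuses to overwrite an existing copy, so a file you have already redacted is not silently replaced.

```python
import shutil
from pathlib import Path


def make_redaction_copy(original: Path) -> Path:
    """Copy `original` alongside itself as '<name>-redacted<ext>'.

    Returns the path of the new copy. The original file is left untouched.
    """
    copy_path = original.with_name(f"{original.stem}-redacted{original.suffix}")
    if copy_path.exists():
        # Avoid clobbering a copy that may already contain redactions.
        raise FileExistsError(f"{copy_path} already exists; not overwriting it")
    shutil.copy2(original, copy_path)  # copy2 also preserves file timestamps
    return copy_path
```

Calling `make_redaction_copy(Path("statement.pdf"))` (a placeholder filename) would create `statement-redacted.pdf` in the same folder, and that copy is the file to open and redact.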
Once you have made a copy of the document, you are ready to redact it. Here’s how:
- Open the copy of the PDF document in the Preview app on your Mac.
- From the menu bar, select Tools > Redact.
- A warning will appear advising you that any “redacted content is permanently deleted.” Click OK to dismiss the warning.
- Now, drag the text selection cursor over any text you want to redact. This may include your name, address, email, telephone number, Social Security number or any other confidential information. As you drag the text selection tool over your selection, black bars with gray Xs will be placed across the text. This tells you that the text is marked for redaction.
- Continue marking any other text you want to redact throughout the document.
- Once you have marked all the text you want redacted, you can move your mouse over the black bars to see the text underneath, if you wish. You can also drag the text cursor back over marked text to deselect it and remove the redaction mark.
- If you are happy with your redaction selections, save the document. But keep in mind that even after saving, the selected text has not yet been redacted.
- Now that the document has been saved, close the PDF (keyboard shortcut: Command-W) to complete the redaction. Once you do this, the text below the redaction marks will be destroyed.
When you open the document again, you’ll see permanent black lines with gray Xs where the previous text was. But the text below those lines has been destroyed and should now be unrecoverable.
Some things to keep in mind
The above method should ensure that the selected text is properly redacted, so that neither an AI chatbot nor anyone who accesses the redacted document in the future can retrieve it. However, redacting personally identifiable information in a document does not necessarily keep you anonymous to ChatGPT and other AI chatbots.
That’s because even if you hide all your personally identifiable information in the document, if you’re logged into ChatGPT, OpenAI will of course know that your account is the one that uploaded that March bank statement or that medical report.
This means that if you want as much anonymity as possible, you should not only securely redact sensitive information in your documents before uploading them to AI chatbots, but also avoid uploading them to any AI chatbot while you are logged in. As an added measure, it’s a good idea to remove metadata from a PDF before uploading it, as this metadata may include your name or other information.
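Before removing metadata, it can be useful to see whether a file carries any. The Python sketch below is a rough check under stated assumptions: it looks only for the standard Document Information entries (such as Author and Title) written as plain, parenthesized strings in the uncompressed part of the file. The function name `find_plain_metadata` is made up for this example. Metadata stored as hex strings, inside compressed object streams, or in an embedded XMP packet will not show up, and `/Title` can also match bookmark entries, so treat the result as a hint rather than a guarantee. It reports what it finds and does not remove anything.

```python
import re

# Entry names from the PDF Document Information dictionary.
INFO_KEYS = ("Title", "Author", "Subject", "Keywords", "Creator", "Producer")


def find_plain_metadata(pdf_bytes: bytes) -> dict[str, str]:
    """Return Info-style entries stored as plain '(...)' strings.

    Assumption: values are simple literal strings without unescaped
    nested parentheses; anything more exotic is missed or cut short.
    """
    found = {}
    for key in INFO_KEYS:
        # e.g. /Author (Jane Doe), allowing backslash-escaped characters
        pattern = rb"/" + key.encode("ascii") + rb"\s*\(((?:\\.|[^\\)])*)\)"
        match = re.search(pattern, pdf_bytes)
        if match:
            found[key] = match.group(1).decode("latin-1")
    return found
```

If the returned dictionary includes your name or employer under a key such as `Author`, that is a sign the file is worth cleaning with a dedicated metadata-removal tool before you upload it.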

