Hi Folks,
Thank you for your excellent comments. I added them to my list. Below is the updated list. Is it missing anything? /Roger
Here are some things to know about base64:
1. Base64 encoding is specified in RFC 4648.
2. Base64-encoded data is plain text.
3. There are several base64 alphabets:
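a. The standard base64 alphabet consists of these 64 ASCII characters: A-Z, a-z, 0-9, +, and /. A 65th character, the equals symbol ( = ), is not part of the alphabet; it is used only as padding at the end of the encoded data.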
b. In the URL and Filename safe base64 alphabet, the plus symbol ( + ) is replaced with a minus sign ( - ) and the forward slash symbol ( / ) is replaced with an underscore symbol ( _ ).
c. Other base64 alphabets are called non-standard, or custom, base64 alphabets.
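d. Some specifications that use base64 (MIME, for example) allow whitespace such as line breaks inside the encoded data, and decoders ignore it. RFC 4648 itself requires decoders to reject characters outside the alphabet unless the referring specification says otherwise.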
4. Any type of file, from plain text to binary executable, can be base64-encoded.
5. Base64 encoding enables binary objects to be transported over text-based protocols, such as SMTP.
6. People have created regular expressions that specify the pattern of base64 text. Text can match such a regular expression and yet not be base64. That is, text that appears to be base64 may not be.
7. External information must be provided to tell whether text is base64.
8. If external information says that text is base64, the external information might be incorrect, either by accident or by intent.
9. Performing base64 decoding on data that is not base64 might cause harm.
a. Is there a way to perform base64 decoding that is guaranteed to never cause harm?
10. There is nothing in base64-encoded data which identifies the media type of the data. External information must be provided to tell the media type. Without external information, the media type must be discovered (if possible).
11. If there is external information about the media type of the data, the external information might be incorrect, either by accident or by intent.
12. Processing an object, assuming it is of media type A when it is actually of media type B, might cause harm.
13. Decoding base64 text is a trivial task.
14. Data that is base64-encoded cannot be directly viewed, used, or inspected.
15. Compared to data that is not encoded, viewing/using/inspecting base64-encoded data requires an additional step: decode and then view/use/inspect.
16. Without external information about the media type of the data that is base64-encoded, there are two additional steps to viewing/using/inspecting base64-encoded data: decode, determine the media type (if possible), and then view/use/inspect.
17. Text formats such as XML and JSON cannot carry arbitrary binary data directly. If binary data must be carried by a text format, the binary data can be base64-encoded, thus generating plain text, and then the base64-encoded plain text can be carried by the text format.
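18. The size of base64-encoded data is roughly 4/3 the size of the original data (about 33% larger). Every 3 bytes of input become 4 characters of output, so with padding, n bytes encode to 4 * ceil(n/3) characters. Line breaks, if inserted, add slightly more.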
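
A few of the items above (3, 6, 9 and 18) can be tried out with Python's standard base64 module. This is only a rough sketch, not a recommended implementation: the regular expression and the helper names (BASE64_PATTERN, looks_like_base64, strict_decode, encoded_length) are made up for illustration, and the pattern is just one of many that people use.

```python
import base64
import binascii
import re
from typing import Optional

# Hypothetical pattern of the kind mentioned in item 6: groups of four
# characters from the standard alphabet, with optional "=" padding at the end.
BASE64_PATTERN = re.compile(
    r"(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?"
)


def looks_like_base64(text: str) -> bool:
    """Return True if the text matches the pattern. This proves nothing
    about whether the text was actually produced by base64 encoding."""
    return BASE64_PATTERN.fullmatch(text) is not None


def strict_decode(text: str) -> Optional[bytes]:
    """Decode with the standard alphabet, returning None instead of raising
    when the input contains other characters or has bad padding."""
    try:
        return base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return None


def encoded_length(n: int) -> int:
    """Length of padded base64 output (no line breaks) for n input bytes."""
    return 4 * ((n + 2) // 3)


# Item 3: the same three bytes in the standard and the URL-safe alphabet.
raw = bytes([0xFB, 0xFF, 0xFE])
standard = base64.b64encode(raw).decode("ascii")          # "+//+"
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")  # "-__-"

# Item 6: an ordinary English word matches the pattern and decodes cleanly.
word = "Folk"
word_matches = looks_like_base64(word)
word_decoded = strict_decode(word)

# Item 3d: a line break is skipped by the default (lenient) decoder but
# rejected when validation is requested.
wrapped = "Rm9s\naw=="
lenient = base64.b64decode(wrapped)
strict = strict_decode(wrapped)

# Item 18: 100 bytes of input become 136 characters of output.
size_check = (len(base64.b64encode(bytes(100))), encoded_length(100))

print(standard, url_safe, word_matches, word_decoded, lenient, strict, size_check)
```

Note that strict_decode only protects against malformed input. It says nothing about whether the decoded bytes are safe to process, which is the concern in items 9 through 12.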