Sadly, all my best text encoding stories would make me identifiable to coworkers, so I can’t share them here. A shame, because there’s been some funny stuff over the years. Wait, where did I go wrong that I have multiple text encoding stories?
That said, I mostly just deal with normal stuff like UTF-8, UTF-16, Latin-1, and ASCII.
My favourite was a junior dev who was like, “when I read from this input file the data is weirdly mangled and unreadable, so as the first processing step I’ll just remove all null bytes, which seems to leave me with ASCII text.”
(It was UTF-16.)
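For anyone wondering why the null-byte trick "seems to" work, here is a rough Python sketch. The function names and the sample string are made up for illustration, and it assumes little-endian UTF-16 with a BOM, which is only one of the ways such a file can show up.

```python
import codecs

def strip_nulls(raw: bytes) -> str:
    # The approach from the story: drop every 0x00 byte, then treat the rest as ASCII.
    # Bytes outside the ASCII range become U+FFFD instead of raising an error.
    return raw.replace(b"\x00", b"").decode("ascii", errors="replace")

def decode_utf16(raw: bytes) -> str:
    # The "utf-16" codec reads the BOM to pick the byte order and then drops it.
    return raw.decode("utf-16")

# Hypothetical sample input: a BOM followed by little-endian UTF-16 code units.
sample = codecs.BOM_UTF16_LE + "héllo €".encode("utf-16-le")

# Pure ASCII text without a BOM, where the trick happens to look correct.
ascii_only = "hello".encode("utf-16-le")

if __name__ == "__main__":
    print(repr(strip_nulls(ascii_only)))
    print(repr(strip_nulls(sample)))
    print(repr(decode_utf16(sample)))
```

In UTF-16 every ASCII character is stored as its ASCII byte plus a 0x00 byte, so deleting the nulls gives the right answer right up until the file contains a BOM, an accented letter, or anything like the euro sign, whose two bytes are both non-zero.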
You’ve got to make sure you’re not over-specializing. I’d recommend trying to roll your own time zone library next.