Uhaw Pa Sa Camel

 
d'Doc
Alabang, Muntinlupa City, Philippines
Beer-loving Gunner extraordinaire, perennial vocalist, guitarist, dog person, and wet kisser in one neat li'l package.

Thursday, November 04, 2010

Python Short Hacking Tip #3: Know your encoding

 
It is next to impossible to determine which encoding was used just by looking at a string of bytes. The next best thing is checking whether the string decodes cleanly under one specific encoding.

def is_encoding(enc, s):
    # Return True if the byte string s decodes without error as enc.
    try:
        s.decode(enc)
        return True
    except UnicodeDecodeError:
        return False


Sample run:
>>> is_encoding('utf-8', u'Hello World \xdc'.encode('iso-8859-1'))
False


Take note that if every character in the byte string is in the ASCII set, is_encoding returns True even when the string was encoded as ISO-8859-1, because ASCII bytes are identical in both encodings.

>>> is_encoding('utf-8', u'Hello World'.encode('iso-8859-1'))
True
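
One way to build on this is to run is_encoding over a short list of candidates and see which ones survive. This is only a rough sketch: matching_encodings and its default candidate list are names made up for illustration, not part of the original tip. A clean decode also does not prove the guess is right. ISO-8859-1 maps every possible byte, so it will always pass.

```python
def is_encoding(enc, s):
    # Return True if the byte string s decodes without error as enc.
    try:
        s.decode(enc)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical helper: the name and the default candidates are assumptions.
def matching_encodings(s, candidates=('ascii', 'utf-8', 'iso-8859-1')):
    # Keep every candidate that decodes s cleanly, in the order given.
    return [enc for enc in candidates if is_encoding(enc, s)]
```

Put the strictest encodings first in the list, so that the first match is the most informative one. A permissive encoding like ISO-8859-1 belongs at the end as a fallback.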
