Python: guess the encoding of a string and convert it to UTF-8

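def guessEncodingAndDecode(data):
    # Candidate encodings, tried in order; add more if you want.
    # Note: iso8859-1 can decode any byte sequence, so the entries
    # listed after it are never actually reached.
    guess_list = ['utf-8', 'iso8859-1', 'iso8859-2', 'iso8859-15', 'ascii']
    htmlContent = None  # returned as-is if no candidate succeeds

    for best_enc in guess_list:
        try:
            # Decode with the current guess, then re-encode as UTF-8.
            htmlContent = data.decode(best_enc).encode('utf-8')
            break
        except UnicodeDecodeError:
            # Wrong guess; try the next candidate.
            pass

    return htmlContent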
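
The same idea can be sketched for Python 3, where the input is a bytes object and you usually want the decoded text plus the encoding that worked. This is only a minimal sketch, not a definitive implementation: the function name guess_encoding_and_decode and the candidates parameter are made up for illustration, and the strict candidate (utf-8) is placed first because iso8859-1 accepts any input.

```python
def guess_encoding_and_decode(data, candidates=("utf-8", "iso8859-1")):
    """Return (text, encoding) for the first candidate that decodes `data`.

    Hypothetical helper for illustration; `candidates` is an assumed
    parameter, ordered from strictest to most permissive.
    """
    for enc in candidates:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            # This candidate does not fit the bytes; try the next one.
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Usage: bytes that are not valid UTF-8 fall through to iso8859-1.
text, used = guess_encoding_and_decode("café".encode("iso8859-1"))
print(used, text)
```

Keep in mind that this only finds an encoding that decodes without error, which is not necessarily the encoding the data was written in.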

Posted in python, Tips And Tricks
