Demystifying ASP Code Pages, Response.Write, Response.BinaryWrite, Strings, and Charsets
ASP example that (hopefully) demystifies some issues regarding the use of international charsets in literal strings and Response output.

<% @CodePage = 65001 %>
<% Response.CodePage = 1252 %>
<%
' How to use any charset / code page in ASP.

' The first directive, "@CodePage", is the code page of the ASP file.
' When you save the .asp file using your editor, save it in this code page.
' This is the charset used for literal strings within your ASP scripting.
'
' The Response.CodePage directive indicates the charset encoding emitted
' by Response.Write.
' The charset specified in the <meta> tag, as shown below, must match
' the Response.CodePage.
%>
<html>
<head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head>
<body>
<%
' Let's examine this seemingly simple statement:
myStr = "eèéêë"

' myStr is a String. Strings in ASP are Unicode (2 bytes/char).
' This particular .asp file is saved in the utf-8 encoding (our @CodePage is utf-8).
' Therefore, the ASP scripting engine converts the literal string from utf-8 to
' Unicode. (utf-8 is a multibyte encoding of Unicode, but it is a different
' character encoding than the 2-byte/char Unicode used for strings.)

' Here's another seemingly simple statement:
Response.Write myStr & "<br>"

' Response.Write accepts a Unicode string argument and emits it to the response output
' using the Response.CodePage. In this case, it converts from Unicode to windows-1252
' and emits windows-1252, which is 1-byte/char.

' What about this?:
Response.Write "eèéêë<br>"

' The utf-8 literal string is emitted as windows-1252.
' (Internally, it is probably converted to Unicode, passed to Response.Write, and
' then converted and emitted as windows-1252.)

' What about this?:
Response.Write "eèéêë私はガラスを食<br>"

' We are able to add Japanese characters to our literal string, because the
' .asp is saved using utf-8. However, the Response.CodePage is windows-1252,
' and windows-1252 has no code points for Japanese characters.
' The characters cannot be converted, so they are replaced with question marks.

' The Chilkat String object allows the string to be retrieved as Unicode,
' or in any charset:
set cks = Server.CreateObject("Chilkat_9_5_0.CkString")
cks.Append "eèéêë"

' The "Str" property is the current value of the string in Unicode.
' It can be passed to Response.Write or assigned to a string variable.
Response.Write cks.Str & "<br>"

' We can also emit the string in any code page. For example:
Response.BinaryWrite cks.EmitMultibyte("windows-1252")

' EmitMultibyte works here because (1) BinaryWrite simply passes the bytes to the
' response unmodified, and (2) we are emitting windows-1252, which matches
' both the Response.CodePage *and* the charset specified in the <meta> tag.
Response.Write "<br>"

' What about this?:
Response.BinaryWrite cks.EmitMultibyte("utf-8")

' We don't get what we expect because Response.BinaryWrite emits the bytes exactly
' as it receives them. EmitMultibyte is emitting utf-8, and
' BinaryWrite passes the bytes untouched to the response output.
' However, the browser is expecting windows-1252 (because of our <meta> tag),
' so the bytes are incorrectly interpreted byte-for-byte as
' windows-1252 characters.
%>
</body>
</html>
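The conversions described in the ASP comments do not depend on ASP itself, so they can be reproduced byte-for-byte outside of IIS. The following is a minimal Python sketch, not ASP code. The helper names load_literal, response_write, and browser_view are hypothetical stand-ins for the @CodePage conversion, Response.Write, and the browser's decoding of the response, and Python's "cp1252" codec is assumed to behave like windows-1252 for these characters.

```python
# Hypothetical helpers that model the conversions; they are not ASP or Chilkat APIs.

def load_literal(file_bytes: bytes, file_code_page: str) -> str:
    """Model of @CodePage: decode a literal's bytes as they are saved in the .asp file."""
    return file_bytes.decode(file_code_page)


def response_write(text: str, response_code_page: str) -> bytes:
    """Model of Response.Write: encode the Unicode string to Response.CodePage.

    Characters that have no mapping in the target code page become '?'.
    """
    return text.encode(response_code_page, errors="replace")


def browser_view(body: bytes, meta_charset: str) -> str:
    """Model of the browser decoding the response using the <meta> charset."""
    return body.decode(meta_charset, errors="replace")


# The literal as stored in a utf-8 file: 1 byte for 'e', 2 bytes per accented letter.
file_bytes = "eèéêë".encode("utf-8")
my_str = load_literal(file_bytes, "utf-8")

# Response.Write with Response.CodePage = 1252: one byte per character.
written = response_write(my_str, "cp1252")

# Japanese characters have no windows-1252 mapping, so each one becomes '?'.
japanese = response_write("私はガラスを食", "cp1252")

# BinaryWrite of utf-8 bytes, which the browser then reads as windows-1252:
# each 2-byte utf-8 sequence shows up as two unrelated characters.
mojibake = browser_view("eèéêë".encode("utf-8"), "cp1252")

print(len(file_bytes), len(written))  # 9 5
print(written)                        # b'e\xe8\xe9\xea\xeb'
print(japanese)                       # b'???????'
```

Here mojibake holds "eÃ¨Ã©ÃªÃ«", the kind of output the last BinaryWrite call in the ASP example produces, while browser_view(written, "cp1252") round-trips back to "eèéêë" because the emitted bytes and the declared charset agree.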
© 2000-2024 Chilkat Software, Inc. All Rights Reserved.