1. PWB 518: Build International Applications With PowerBuilder 10
Jin-You Zhu, Sr. Software Engineer, jyzhu_at_sybase.com
August 15-19, 2004
2. The Enterprise. Unwired.
Industry and Cross Platform Solutions
Unwire People
Unwire Information
Manage Information
- Adaptive Server Enterprise
- Adaptive Server Anywhere
- Sybase IQ
- Dynamic Archive
- Dynamic ODS
- Replication Server
- OpenSwitch
- Mirror Activator
- PowerDesigner
- Connectivity Options
- EAServer
- Industry Warehouse Studio
- Unwired Accelerator
- Unwired Orchestrator
- Unwired Toolkit
- Enterprise Portal
- Real Time Data Services
- SQL Anywhere Studio
- M-Business Anywhere
- Pylon Family (Mobile Email)
- Mobile Sales
- XcelleNet Frontline Solutions
- PocketBuilder
- PowerBuilder Family
- AvantGo
Sybase Workspace
4. What is Unicode?
- Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.
- Two Unicode character sets
  - UCS-2 uses one 16-bit unit (2 bytes) to represent a character (up to 65,536 code values).
  - UCS-4 uses one 32-bit unit (4 bytes) to represent a character. UCS-4 is a superset of UCS-2 and includes more characters.
- Three popular Unicode Transformation Formats (UTF)
  - UTF-8 uses 1 to 4 bytes to represent one Unicode character. ASCII characters keep their ASCII byte values. A UCS-2 character needs 1 to 3 bytes; a UCS-4 character needs 1 to 4 bytes.
  - UTF-16 uses one 16-bit unit (for UCS-2 characters) or two 16-bit units (for characters outside UCS-2) to represent one Unicode character.
  - UTF-32 uses one 32-bit unit to represent one Unicode character.
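The byte counts above can be checked with a short sketch. It is written in Python only because Python is easy to run; it is not PowerBuilder code. The sample characters and the helper name `encoded_sizes` are illustrative choices, not part of the original slides.

```python
# Byte counts for sample characters in the three UTF encodings.
# Escapes are used so the source file itself stays plain ASCII.
SAMPLES = {
    "A": "ASCII letter",
    "\u00e9": "accented Latin letter",
    "\u4e2d": "CJK ideograph inside the 16-bit range",
    "\U0001F600": "character outside the 16-bit range",
}


def encoded_sizes(ch):
    """Return (UTF-8, UTF-16, UTF-32) byte counts for one character."""
    # The "-le" codecs are used so that no byte order mark is counted.
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),
        len(ch.encode("utf-32-le")),
    )


for ch, label in SAMPLES.items():
    print(f"U+{ord(ch):04X} ({label}): {encoded_sizes(ch)}")
```

Only the last sample needs two 16-bit units in UTF-16, and UTF-32 is always 4 bytes per character.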
5. Why use Unicode?
- Unicode allows a program or website to be targeted at multiple platforms, languages, and countries.
- It defines codes for all characters used in all major languages today.
- It can encode multilingual text.
- Unicode is the official way to implement ISO/IEC 10646.
- It is being adopted by many industry leaders.
- It allows data transfer between different systems without data corruption.
6. Benefits and Pitfalls of Using Unicode
- Unicode can handle text in any language or any combination of languages.
- You can process and show characters from multiple languages in a single form.
- One application can serve all languages.
- Conversion is necessary only on incoming and outgoing data, and it does not corrupt the data.
- No data is lost when converting from any code page to Unicode.
- It simplifies operations on text because there is no longer a need to keep track of which encoding scheme is being used.
- Disadvantages
  - Because a character in UTF-16 takes at least 2 bytes, text consumes more memory than in a single-byte code page.
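Two of the claims above, lossless conversion from a code page and the extra memory cost, can be illustrated with a small sketch. It is Python rather than PowerScript, and the helper names and sample strings are hypothetical. Python's `cp1252`, `cp932`, and `cp936` codecs stand in for the corresponding Windows code pages.

```python
def round_trips(text, code_page):
    """True if code page bytes -> Unicode -> code page bytes is lossless."""
    original = text.encode(code_page)         # bytes as a legacy system stores them
    as_unicode = original.decode(code_page)   # conversion on incoming data
    return as_unicode.encode(code_page) == original  # conversion on outgoing data


def storage_cost(text, code_page):
    """Return (code page bytes, UTF-16 bytes) for the same text."""
    return len(text.encode(code_page)), len(text.encode("utf-16-le"))


print(round_trips("Gr\u00fc\u00dfe", "cp1252"))      # Western European text
print(round_trips("\u65e5\u672c\u8a9e", "cp932"))    # Japanese text
print(storage_cost("Hello", "cp1252"))               # single-byte text doubles
print(storage_cost("\u4e2d\u6587", "cp936"))         # DBCS text stays the same size
```

The memory penalty applies mainly to single-byte text. Text that already needs two bytes per character in a DBCS code page is the same size in UTF-16.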
7. PowerBuilder 10 Unicode Enabling
- PB10 uses Unicode internally. It can process and display Unicode characters, which lets your applications support multiple languages.
- Database: DBCS and Unicode databases are supported.
- PowerScript: PB10 has more PowerScript functions to process Unicode strings and ANSI (DBCS) strings.
- PowerScript: manipulation of ANSI and Unicode files.
- DW/XmlDW: Select/Insert/Update of multilanguage data is supported.
- PBNI: two sets of interfaces are implemented. Users can choose the Unicode API or the ANSI API.
- ORCA: two sets of interfaces are implemented. Users can choose the Unicode API or the ANSI API.
- External functions: ANSI and Unicode parameters are supported.
- A migration tool has been developed to help solve migration issues.
8. PB10 Supports ANSI and Unicode Databases (1)
- ANSI/DBCS database
  - A database that uses an ANSI (or DBCS) code page as its character set, such as CP1252 for Western European languages, CP932 for Japanese, or CP936 for Simplified Chinese.
- Unicode database
  - A Unicode database is a database whose character set is a Unicode format, such as UTF-8 or UTF-16.
  - All data in the database is in Unicode format, and any data saved to the database must be converted to Unicode implicitly or explicitly.
- Unicode column
  - A database that uses ANSI (or DBCS) as its character set may use special data types to store Unicode data. These data types are NCHAR, NVARCHAR, and NVARCHAR2. Columns with these data types can store Unicode data. Any data saved into such a column must be converted to Unicode explicitly.
9. PB10 Supports ANSI and Unicode Databases (2)
- In PB10, most DB interfaces support ANSI and Unicode databases.
- Note: some interfaces need a patch for EAServer 4.2.3/5.1.
10. DB Interfaces SYC/SYJ
- A new DBParm (UTF8) is defined for SYC/SYJ.
- UTF8 can be 1 or 0. The default value is 0.
- If this DBParm is set to 0, the DB driver converts the data to the client machine's locale, and the client then converts it to Unicode.
- If it is set to 1, the DB driver returns data in Unicode for multilanguage support. In this case, the ASE server needs to be specially configured.
- How?
  - sp_configure "enable unicode conversion", 2
11. DB Interface ODBC
- For client/server applications: PB10 can consume data from ANSI and Unicode databases. No special setting is needed.
- For n-tier applications: if you use ODBC to connect to an ASA Unicode database through a connection cache, you need a special patch for EAServer 4.2.3/5.1, which adds a new connection cache called ODBCU. With this new connection handle, a PB component can access Unicode data from a Unicode ASA database.
12. DB Interface O90
- For client/server applications: PB10 can consume data from ANSI and Unicode databases via O90/O84. No special setting is needed.
- For n-tier applications: if you use O90 to connect to an Oracle 9 database through a connection cache, you need a special patch for EAServer 4.2.3/5.1, which adds a new connection cache called OCI_9U. With this new connection handle, a PB component can access Unicode data in Oracle.
13. DB Interfaces JDBC/OleDB/ADO.Net
- PB10 can consume data from ANSI and Unicode databases. No special setting is needed.
14. DB Interface Informix Native
- PB10 can consume data from Informix ANSI databases. No special setting is needed.
- Note: Informix Unicode databases are not supported in PB10.
15. PowerScript
- Data types
- Functions to manipulate ANSI and Unicode strings
- Functions to process Unicode files
16. PowerScript Data Types
- String
  - A String is always a Unicode string. All data in a String is Unicode. There are no ANSI strings any more.
  - Characters from multiple languages can be put in one PB string.
- Blob
  - Blob remains a binary data type. It can store binary data, ANSI characters, or Unicode characters.
  - How?
17. Conversion between String and Blob in PowerScript
- Conversion from Blob to String
  - String ( blob, Encoding )
  - Converts a Blob to a String.
  - Encoding can be EncodingANSI!, EncodingUTF8!, EncodingUTF16LE!, or EncodingUTF16BE!. The default is EncodingUTF16LE!.
- Conversion from String to Blob
  - Blob ( string, Encoding )
  - Converts a String to a Blob.
  - Encoding can be EncodingANSI!, EncodingUTF8!, EncodingUTF16LE!, or EncodingUTF16BE!. The default is EncodingUTF16LE!.
- Other conversion functions
  - FromANSI()/ToANSI()/FromUnicode()/ToUnicode() are still supported in PB10 but are obsolete. We encourage users to switch to the String/Blob functions.
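The effect of the Encoding argument can be sketched outside PowerBuilder. The Python analogue below is not PowerScript. The `CODECS` table and the `to_blob`/`to_string` helpers are hypothetical names, and `cp1252` stands in for EncodingANSI! on the assumption of a Western European locale.

```python
# Hypothetical mapping from the PB10 encoding names to Python codec names.
# "cp1252" is an assumption: EncodingANSI! depends on the machine's locale.
CODECS = {
    "EncodingANSI!": "cp1252",
    "EncodingUTF8!": "utf-8",
    "EncodingUTF16LE!": "utf-16-le",
    "EncodingUTF16BE!": "utf-16-be",
}


def to_blob(text, encoding="EncodingUTF16LE!"):
    """String -> bytes, defaulting to UTF-16LE like the slide's Blob()."""
    return text.encode(CODECS[encoding])


def to_string(blob, encoding="EncodingUTF16LE!"):
    """bytes -> String, defaulting to UTF-16LE like the slide's String()."""
    return blob.decode(CODECS[encoding])


for name in CODECS:
    blob = to_blob("caf\u00e9", name)
    # The round trip succeeds only when both directions use the same encoding.
    print(name, blob, to_string(blob, name) == "caf\u00e9")
```

The same four characters produce four different byte sequences, so a Blob must be converted back with the encoding that was used to create it.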
18. PowerScript Functions to Manipulate ANSI and Unicode Strings
- Len/Left/Mid/Right/...
  - These functions are Unicode character based.
- LenW/LeftW/MidW/RightW/...
  - All W functions are also Unicode character based. They behave the same as the Len functions.
- LenA/LeftA/MidA/RightA/...
  - A new set of functions (A) is added for string manipulation by byte. PB converts the PB String (Unicode) to DBCS (based on the machine's locale), then applies the operation.
- The migration tool helps identify and replace these functions.
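The difference between character-based and byte-based lengths is easy to see in a small sketch. This is Python, not PowerScript: `len_w` and `len_a` are hypothetical stand-ins for the Len/LenW and LenA behaviour described above, and the code page is passed explicitly instead of being read from the machine's locale.

```python
def len_w(text):
    """Character-based length, as described for Len/LenW."""
    return len(text)


def len_a(text, code_page):
    """Byte-based length after converting to an ANSI/DBCS code page, as described for LenA."""
    return len(text.encode(code_page))


mixed = "PB10\u4e2d\u6587"                    # four ASCII characters plus two Chinese characters
print(len_w(mixed))                           # counts characters
print(len_a(mixed, "cp936"))                  # each Chinese character takes 2 bytes in CP936
print(len_w("caf\u00e9"), len_a("caf\u00e9", "cp1252"))  # equal in a single-byte code page
```

Code that sizes buffers or fixed-width fields in bytes needs the A functions after migration. Code that counts characters can keep using Len.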
19. PowerScript Functions to Process Unicode Files
- File types
  - ANSI/DBCS files
  - Unicode (UTF16/UTF8) files (new)
  - Binary files
- File operation functions
  - FileEncoding(filename)
  - FileOpen(filename, filemode, fileaccess, filelock, writemode, encoding)
    - filemode: LineMode!, StreamMode!, TextMode!
  - FileRead/FileWrite: read/write in chunks of up to 32,765 bytes
  - FileReadEx/FileWriteEx: read/write a whole file
  - FileSeek/FileSeek64
- Encoding
  - EncodingANSI!, EncodingUTF8!, EncodingUTF16LE!, and EncodingUTF16BE!
- Conversion
  - When reading or writing, conversion takes place if needed.
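A common way to tell these file types apart is to look for a byte order mark (BOM) at the start of the file. The Python sketch below shows that technique. Whether FileEncoding works exactly this way is an assumption, and `sniff_encoding` is a hypothetical helper that merely returns the PB10 encoding names as strings.

```python
import codecs


def sniff_encoding(data):
    """Guess a text file's encoding from its leading byte order mark."""
    if data.startswith(codecs.BOM_UTF8):       # EF BB BF
        return "EncodingUTF8!"
    if data.startswith(codecs.BOM_UTF16_LE):   # FF FE
        return "EncodingUTF16LE!"
    if data.startswith(codecs.BOM_UTF16_BE):   # FE FF
        return "EncodingUTF16BE!"
    # No BOM: treat the file as ANSI/DBCS text (assumption of this sketch).
    return "EncodingANSI!"


print(sniff_encoding(codecs.BOM_UTF16_LE + "Employee".encode("utf-16-le")))
print(sniff_encoding(b"Employee"))
```

A UTF-8 file saved without a BOM would be reported as ANSI by this sketch, which is one reason to state the encoding explicitly when creating files.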
20. PowerScript Examples
- Read an ANSI file
  - Integer li_FileNum
  - String s_rec
  - li_FileNum = FileOpen("Employee.txt")
  - // or: li_FileNum = FileOpen("Employee.txt", TextMode!)
  - FileRead(li_FileNum, s_rec)
- Read a Unicode file
  - Integer li_FileNum
  - String s_rec
  - li_FileNum = FileOpen("EmployeeU.txt", TextMode!, Read!, EncodingUTF16LE!)
  - FileRead(li_FileNum, s_rec)
- Read a binary file
  - Integer li_FileNum
  - blob bal_rec
  - li_FileNum = FileOpen("Employee.imp", StreamMode!, Read!)
  - FileRead(li_FileNum, bal_rec)
21. DataWindow
- DataWindow supports multilanguage display and manipulation in PB10.
- DW string-related functions are changed to be consistent with the PowerScript functions.
- DW file manipulation functions are extended to Unicode files as well.
22. JSP Authoring and Web Services
- JSP
  - The JSP authoring tool is also Unicode enabled in PB10.
  - Users can choose the format in which to save JSP files (Unicode, UTF8, and ANSI are supported).
  - A JSP page can process ANSI and Unicode requests.
- Web Services
  - In PB10, a Web Services client can handle international characters.
23. XML DataWindow
- XML DataWindow
  - Processes ANSI and Unicode data.
- Tips
  - <%@ page contentType="text/html; charset=UTF-8" %>
  - request.setCharacterEncoding("UTF8")
24. PBNI and ORCA APIs
- PBNI
  - In PB10, PBNI offers two sets of APIs: one for ANSI, the other for Unicode. Users can develop PB extensions as an ANSI build or a Unicode build, as they like.
  - PBNI has templates for users to use.
- ORCA
  - PB10 offers two sets of APIs (ANSI and Unicode) for C functions that extract objects from a PBL or construct a PBL from object files.
25. External Function Call
- Purpose
  - You can define PB global or local functions that map to external function calls in system or third-party DLLs.
- Change of the syntax
  - In PB9 and before, the syntax is:
    - FUNCTION int MessageBoxA(int handle, string content, string title, int showtype) LIBRARY "user32.dll"
  - In PB10:
    - FUNCTION int MessageBox(int handle, string content, string title, int showtype) LIBRARY "user32.dll" ALIAS FOR "MessageBoxA;ansi"
    - -- uses the ANSI version of the system function
    - FUNCTION int MessageBox(int handle, string content, string title, int showtype) LIBRARY "user32.dll" ALIAS FOR "MessageBoxW"
    - -- uses the Unicode version of the system function
- The migration tool will help identify and replace the external function calls in your existing application.
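Why the declaration has to say which variant is being called becomes clearer when the two string layouts are compared. The Python sketch below is an illustration, not PowerBuilder code. The helper names are hypothetical, and `cp1252` is assumed as the ANSI code page.

```python
def ansi_buffer(text, code_page="cp1252"):
    """Bytes an ANSI (A) C function expects: code page bytes plus a 1-byte NUL."""
    return text.encode(code_page) + b"\x00"


def wide_buffer(text):
    """Bytes a Unicode (W) C function expects: UTF-16LE units plus a 2-byte NUL."""
    return text.encode("utf-16-le") + b"\x00\x00"


def read_as_ansi(buffer):
    """Read a buffer the way an ANSI function would: stop at the first NUL byte."""
    return buffer.split(b"\x00", 1)[0]


print(ansi_buffer("Hi"))
print(wide_buffer("Hi"))
# Passing a wide buffer to an ANSI function truncates the text after one character.
print(read_as_ansi(wide_buffer("Hi")))
```

Because PB10 strings are Unicode, a string passed to an ANSI entry point has to be converted first. Handing the raw Unicode buffer to an ANSI function would cut the text off at the first zero byte.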
26. Summary
- PB10 supports multilanguage processing natively.
- PB10 supports both ANSI and Unicode databases.
- PowerScript can handle ANSI, Unicode, and binary files.
- DataWindow and XML DataWindow can process multilanguage data as well.
- JSP supports multilanguage editing and deployment.
- Through PBNI, users have the flexibility to develop ANSI or Unicode extensions.
- PB10 can integrate ANSI and Unicode DLLs.
27. Q and A