ETSI's Bug Tracker - Part 01: TTCN-3 Core Language |
View Issue Details |
|
ID | Project | Category | View Status | Date Submitted | Last Update |
0006789 | Part 01: TTCN-3 Core Language | New Feature | public | 23-07-2014 10:51 | 06-01-2015 18:26 |
|
Reporter | Gyorgy Rethy | |
Assigned To | Gyorgy Rethy | |
Priority | normal | Severity | minor | Reproducibility | have not tried |
Status | closed | Resolution | fixed | |
Platform | | OS | | OS Version | |
Product Version | v4.6.1 (published 2014-06) | |
Target Version | v4.7.1 (published 2015-06) | Fixed in Version | v4.7.1 (published 2015-06) | |
Clause Reference(s) | 6.1.1 e) |
Source (company - Author) | L.M.Ericsson |
|
Summary | 0006789: Allow non-ISO646 (UTF-8 compatible) characters in universal charstrings |
Description | Nowadays more and more protocols are textual or carry textual content, and they allow non-ASCII characters. For example, both XML and JSON allow non-ISO646 characters.
As users increasingly need to work with Unicode characters, there is a requirement to enter non-ASCII characters directly in universal charstring values, instead of using the cumbersome quadruple notation. The new U-coded Unicode character reference (see CR6727) improves the syntax, but does not solve the problem completely. The two syntaxes would complement each other: not all Unicode characters are supported by UTF-8, not all text editors can represent all UTF-8 graphical characters, and some users may want to stick to ISO646 characters, e.g. for backward compatibility reasons.
As TTCN-3 modules shall be stored in UTF-8 (see clause 8), the values cannot be misread (as 2, 3 or 4 separate characters) when the code is transferred between tools (though an older tool may treat the code as erroneous if the language version is not identified in the module header). |
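The two technical points in the description can be illustrated with a small Python sketch (Python is used here only for illustration and is not part of the CR). It assumes that the quadruple is the ISO/IEC 10646 (group, plane, row, cell) tuple, and it uses Latin-1 as a stand-in for a tool that reads the file one byte per character; the helper names to_quadruple and misread_as_single_bytes are hypothetical.

```python
def to_quadruple(ch: str) -> tuple:
    """Return the (group, plane, row, cell) quadruple of a single character.

    Assumption: the quadruple is the four bytes of the code point,
    most significant byte first.
    """
    cp = ord(ch)
    return (cp >> 24) & 0xFF, (cp >> 16) & 0xFF, (cp >> 8) & 0xFF, cp & 0xFF


def misread_as_single_bytes(text: str) -> str:
    """Encode the text as UTF-8, then decode it as Latin-1.

    Latin-1 maps every byte to exactly one character, so one non-ASCII
    character turns into 2, 3 or 4 separate characters.
    """
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    # One character each that needs 2, 3 and 4 bytes in UTF-8.
    for ch in ["\u0171", "\u4E2D", "\U0001F600"]:
        print(ch, to_quadruple(ch), len(misread_as_single_bytes(ch)))
```

Under these assumptions, U+4E2D corresponds to the quadruple (0, 0, 78, 45), which is what a user would otherwise have to type by hand, and the same character becomes three separate characters when its UTF-8 bytes are read with a single-byte encoding.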
Steps To Reproduce | |
Additional Information | |
Tags | No tags attached. |
Relationships | |
Attached Files | draft-res-6789-v1.docx (14,935) 04-11-2014 13:58 http://oldforge.etsi.org/mantis/file_download.php?file_id=3158&type=bug
draft-res-6789-v2.docx (25,191) 06-11-2014 15:09 http://oldforge.etsi.org/mantis/file_download.php?file_id=3178&type=bug |
|
Issue History |
Date Modified | Username | Field | Change |
23-07-2014 10:51 | Gyorgy Rethy | New Issue | |
06-10-2014 10:55 | Gyorgy Rethy | Note Added: 0012224 | |
06-10-2014 10:55 | Gyorgy Rethy | Target Version | => v4.7.1 (published 2015-06) |
09-10-2014 13:58 | Jacob Wieland - Spirent | Note Added: 0012323 | |
03-11-2014 16:33 | Gyorgy Rethy | Assigned To | => Axel Rennoch |
03-11-2014 16:33 | Gyorgy Rethy | Status | new => assigned |
04-11-2014 13:58 | Axel Rennoch | File Added: draft-res-6789-v1.docx | |
04-11-2014 14:00 | Axel Rennoch | Note Added: 0012401 | |
06-11-2014 08:58 | Gyorgy Rethy | Note Added: 0012446 | |
06-11-2014 15:09 | Axel Rennoch | File Added: draft-res-6789-v2.docx | |
06-11-2014 15:11 | Axel Rennoch | Note Added: 0012472 | |
06-11-2014 15:12 | Axel Rennoch | Note Added: 0012473 | |
06-11-2014 15:12 | Axel Rennoch | Assigned To | Axel Rennoch => Gyorgy Rethy |
06-11-2014 15:12 | Axel Rennoch | Status | assigned => acknowledged |
07-11-2014 11:50 | Gyorgy Rethy | Status | acknowledged => confirmed |
07-11-2014 13:37 | Jacob Wieland - Spirent | Note Added: 0012498 | |
06-01-2015 18:23 | Gyorgy Rethy | Status | confirmed => resolved |
06-01-2015 18:23 | Gyorgy Rethy | Resolution | open => fixed |
06-01-2015 18:26 | Gyorgy Rethy | Note Added: 0012649 | |
06-01-2015 18:26 | Gyorgy Rethy | Status | resolved => closed |
06-01-2015 18:26 | Gyorgy Rethy | Fixed in Version | => v4.7.1 (published 2015-06) |
Notes |
|
|
|
|
|
As the only escape-character in TTCN-3 charstring literals is the quote-symbol, I guess this would have to be used.
"aaa"0706"bbb", for instance, could then be the same as "aaa" & <unicode of 0706> & "bbb".
As far as I can see, this does not introduce any backward incompatibility, as at the moment there is no grammar rule that allows a number directly after a charstring literal. |
|
|
|
Based on Jacob's idea we may allow different representations, please see examples in the attachment, since characters do not appear in this box. ;-) |
|
|
|
We shall not extend the scope of the CR. If more or other features are needed, another CR shall be submitted.
The standard specifies that TTCN-3 modules are to be saved in UTF-8, so TTCN-3 editors should support UTF-8 characters (at least a reasonable subset), because they are allowed in comments. So, in principle, there are no technical difficulties in allowing their direct use in universal charstring values as well.
The additional syntax brings in new problems:
- In the case of "aaa"0706"bbb", how can we know what the user wanted to write? It may be a simple typing error, and he/she meant "aaa""0706""bbb"! For this reason I strongly oppose this syntax, i.e. extending the semantics associated with the " character.
Anyway, UTF-8 today covers the vast majority of characters in actual use; therefore the char(U4E2D, U56FD) syntax will be used rarely, or only because of local style guides. |
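To show which characters the char(U4E2D, U56FD) example denotes, here is a minimal Python sketch (again for illustration only). It assumes that each reference is the letter U followed by 1 to 8 hexadecimal digits, as in the example above; the function name decode_u_refs is hypothetical, and the sketch is not a TTCN-3 parser.

```python
import re

# Assumption: a reference is "U" followed by 1..8 hexadecimal digits.
_U_REF = re.compile(r"U([0-9A-Fa-f]{1,8})")


def decode_u_refs(expr: str) -> str:
    """Turn text such as 'char(U4E2D, U56FD)' into the characters it denotes."""
    outer = re.fullmatch(r"\s*char\(\s*(.*?)\s*\)\s*", expr)
    if outer is None:
        raise ValueError("expected the form char(U..., U...)")
    chars = []
    for item in outer.group(1).split(","):
        ref = _U_REF.fullmatch(item.strip())
        if ref is None:
            raise ValueError("not a U-coded reference: " + item.strip())
        # chr() itself rejects values above U+10FFFF with a ValueError.
        chars.append(chr(int(ref.group(1), 16)))
    return "".join(chars)


if __name__ == "__main__":
    text = decode_u_refs("char(U4E2D, U56FD)")
    print(text, len(text), len(text.encode("utf-8")))
```

Under these assumptions the example denotes the two characters 中国, which occupy six bytes in a UTF-8 encoded module when written directly inside the quotes.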
|
|
|
Following the discussion, only a note has been added to 6.1.1 e) in the attached file. |
|
|
|
Please advise whether the new note is sufficient. |
|
|
|
No problem with me. I just pointed out that direct inclusion in charstring literals needs to use the " character, as that is the only escape character we have (unfortunately). In principle, I agree with Gyorgy's reasoning that, mostly, there will be no need to use the char syntax.
The only exception I see is standardization bodies that want to publish their test suites in a non-UTF-8-based format. |
|
|
|
|