Message401078
I noticed unexpected behavior with the Unicode character \U00010900 when it is inserted into a string as a literal character rather than as an escape sequence.
Here is the result in the Python console for both 3.6 and 3.9:
```
>>> s = '0𐤀00'
>>> s
'0𐤀00'
>>> ls = list(s)
>>> ls
['0', '𐤀', '0', '0']
>>> s[0]
'0'
>>> s[1]
'𐤀'
>>> s[2]
'0'
>>> s[3]
'0'
>>> ls[0]
'0'
>>> ls[1]
'𐤀'
>>> ls[2]
'0'
>>> ls[3]
'0'
```
It appears that, for some reason, in this specific case the character is actually stored in a different position than the one shown when printing the complete string. Note that the string already behaves strangely when selecting it in the console: selecting the special character directly highlights the last 3 characters (probably because the console already thinks this character is in the second position).
The same behavior does not occur when using the Unicode code point escape directly:
```
>>> s='000\U00010900'
>>> s
'000𐤀'
>>> s[0]
'0'
>>> s[1]
'0'
>>> s[2]
'0'
>>> s[3]
'𐤀'
```
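To check which position the character really occupies, independent of how the terminal draws it, each character can be listed in stored order together with its code point. The snippet below is a minimal sketch; `describe` is a hypothetical helper name, not something from the report. It also prints the bidirectional class, on the assumption (not confirmed above) that right-to-left rendering in the terminal might be what reorders the display.

```python
import unicodedata


def describe(s):
    """Return (index, code point, name, bidi class) per character, in stored order."""
    return [
        (
            i,
            f"U+{ord(ch):04X}",
            unicodedata.name(ch, "<unnamed>"),
            unicodedata.bidirectional(ch),
        )
        for i, ch in enumerate(s)
    ]


# Built with the escape sequence, so the stored order is unambiguous.
s = "000\U00010900"

for row in describe(s):
    print(row)

# ascii() escapes non-ASCII characters, so the terminal cannot reorder them.
print(ascii(s))
```

Running the same `describe` call on a string typed or pasted with the literal character shows whether the stored order matches what the console displays.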
This was tested using the following Python versions:
```
Python 3.6.0 (default, Dec 29 2020, 02:18:14)
[GCC 10.2.1 20201125 (Red Hat 10.2.1-9)] on linux
Python 3.9.6 (default, Jul 16 2021, 00:00:00)
[GCC 11.1.1 20210531 (Red Hat 11.1.1-3)] on linux
```
on Fedora 34.

| Date | User | Action | Args |
|---|---|---|---|
| 2021-09-05 11:12:09 | maxbachmann | set | recipients: + maxbachmann, vstinner, ezio.melotti |
| 2021-09-05 11:12:09 | maxbachmann | set | messageid: <1630840329.53.0.683865786934.issue45105@roundup.psfhosted.org> |
| 2021-09-05 11:12:09 | maxbachmann | link | issue45105 messages |
| 2021-09-05 11:12:09 | maxbachmann | create | |