Issue45461
Created on 2021-10-13 14:31 by anmyachev, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Files | | |
---|---|---|
File name | Uploaded | Description |
test.py | anmyachev, 2021-10-13 14:31 | test.py - reproducer |
Pull Requests | | |
---|---|---|
URL | Status | Linked |
PR 28939 | merged | serhiy.storchaka, 2021-10-13 21:31 |
PR 28943 | merged | miss-islington, 2021-10-14 10:17 |
PR 28945 | merged | serhiy.storchaka, 2021-10-14 10:56 |
Messages (7) | |||
---|---|---|---|
msg403837 - (view) | Author: Anatoly Myachev (anmyachev) | Date: 2021-10-13 14:31 | |
Expected behavior: if `read()` works correctly, then `readline()` should also work. To reproduce, run the attached file: `python test.py`.
Traceback (most recent call last):
  File "test.py", line 11, in <module>
    f.readline()
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\encodings\unicode_escape.py", line 26, in decode
    return codecs.unicode_escape_decode(input, self.errors)[0]
UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 8191: \ at end of string |
|||
msg403838 - (view) | Author: STINNER Victor (vstinner) | Date: 2021-10-13 14:41 | |
Can you please try to write a simpler (shorter) reproducer? |
|||
msg403840 - (view) | Author: Anatoly Myachev (anmyachev) | Date: 2021-10-13 14:55 | |
Hello! I can reduce it a little. The buffer shouldn't be made any smaller, as there seems to be some kind of relation to the buffer size used for IO operations. buffer = b'col1,col2,col3,col4,col5,col6\\r\\n0,2000-01-01,0,00:00:00,DuBFsyerJU,1809.3924826424557\\r\\n10,2000-01-01,10,01:00:00,AlwGHbVPpB,2853.2392617952996\\r\\n20,2000-01-01,20,02:00:00,TEkGgsYXYz,9933.278931158615\\r\\n30,2000-01-01,30,03:00:00,tfvnynVSfp,8574.917426248916\\r\\n40,2000-01-01,40,04:00:00,YOGjhztMWe,3768.71871233428\\r\\n50,2000-01-01,50,05:00:00,vkTOJSeQmU,6330.252072351792\\r\\n60,2000-01-01,60,06:00:00,LeolDfaGyv,5052.618993456892\\r\\n70,2000-01-01,70,07:00:00,OcyrbYVtyr,4287.371622852719\\r\\n80,2000-01-01,80,08:00:00,VUwDPNhcFV,3589.697826814614\\r\\n90,2000-01-01,90,09:00:00,KOadtzcNyK,4794.158259020925\\r\\n100,2000-01-01,100,10:00:00,rdSOjXJBWC,8826.736894397129\\r\\n110,2000-01-01,110,11:00:00,qzwVBOklhk,8086.105782454443\\r\\n120,2000-01-01,120,12:00:00,UTRlqVfKoD,1012.5061461339624\\r\\n130,2000-01-01,130,13:00:00,wKqEkRhkfw,2511.3137510933934\\r\\n140,2000-01-01,140,14:00:00,LxklWJbgxo,406.7116346419042\\r\\n150,2000-01-01,150,15:00:00,SxmZkdUgHv,8424.978062284761\\r\\n160,2000-01-01,160,16:00:00,nEvzypASGb,9890.252156059063\\r\\n170,2000-01-01,170,17:00:00,xiFkkjoDPB,2728.8359201479675\\r\\n180,2000-01-01,180,18:00:00,boMmgpBXgL,4231.680208002166\\r\\n190,2000-01-01,190,19:00:00,dXLJXWiXZI,7757.44902751916\\r\\n200,2000-01-01,200,20:00:00,PBdjwKoCMD,4915.090357003991\\r\\n210,2000-01-01,210,21:00:00,zGWLALpmoA,359.5243650158153\\r\\n220,2000-01-01,220,22:00:00,CfpZJoOqGZ,704.7990862762942\\r\\n230,2000-01-01,230,23:00:00,DrkxpLhpEN,520.3290677592321\\r\\n240,2000-01-02,240,00:00:00,TDKEBbZAzQ,5218.671660857721\\r\\n250,2000-01-02,250,01:00:00,gULwzvNeWO,4218.66872701774\\r\\n260,2000-01-02,260,02:00:00,ogSyzHWmNY,9026.657391329585\\r\\n270,2000-01-02,270,03:00:00,NetmmthtzN,2027.8312539582244\\r\\n280,2000-01-02,280,04:00:00,PoYiHipTzR,7667.627476518046\\r\\n290,2000-01-02,290,05:00:
00,MjHIRGmsoq,4144.001792539834\\r\\n300,2000-01-02,300,06:00:00,qESRSNnNnO,5348.024681284471\\r\\n310,2000-01-02,310,07:00:00,sSIjcXWhLC,3622.4673907599413\\r\\n320,2000-01-02,320,08:00:00,IvjrlljbeB,7500.419388155823\\r\\n330,2000-01-02,330,09:00:00,aVWVRXZjZy,3686.5972529264213\\r\\n340,2000-01-02,340,10:00:00,QKeTjcNlCG,1228.9751449454411\\r\\n350,2000-01-02,350,11:00:00,phEdHCVsbe,4254.15983968718\\r\\n360,2000-01-02,360,12:00:00,ursHJjQxRK,6099.131673115221\\r\\n370,2000-01-02,370,13:00:00,JvjcRlYcYG,1503.3586866746164\\r\\n380,2000-01-02,380,14:00:00,gzCyqHPRRb,7816.898213939008\\r\\n390,2000-01-02,390,15:00:00,lQZmobRwzt,8295.113759829599\\r\\n400,2000-01-02,400,16:00:00,qspiYGfTou,1987.8215069414816\\r\\n410,2000-01-02,410,17:00:00,mcqWMMzomf,15.878728570531964\\r\\n420,2000-01-02,420,18:00:00,fiPsxulpGU,5380.485947841902\\r\\n430,2000-01-02,430,19:00:00,gTAyTkpeez,4720.7159908343565\\r\\n440,2000-01-02,440,20:00:00,hzFbhAPvFX,946.5797295044975\\r\\n450,2000-01-02,450,21:00:00,NYNcYxsyVl,7333.850198973723\\r\\n460,2000-01-02,460,22:00:00,wvgMmIxLzo,7399.341315026157\\r\\n470,2000-01-02,470,23:00:00,bZoyzAGgEC,5464.053510955946\\r\\n480,2000-01-03,480,00:00:00,jZNaceUYyr,1390.8829937709977\\r\\n490,2000-01-03,490,01:00:00,sbfLgcCpru,9626.900131786555\\r\\n500,2000-01-03,500,02:00:00,MHpAkHfnmV,9406.471079089133\\r\\n510,2000-01-03,510,03:00:00,ENdFBGtRCq,3740.8773019724517\\r\\n520,2000-01-03,520,04:00:00,FzqXhMLHLY,4270.3585910905\\r\\n530,2000-01-03,530,05:00:00,wWinjEGhAj,8548.152649813675\\r\\n540,2000-01-03,540,06:00:00,LcxAImCvxt,4097.693176523874\\r\\n550,2000-01-03,550,07:00:00,sDhzGBYKpt,1673.7466277500146\\r\\n560,2000-01-03,560,08:00:00,jhagjcZhGU,4103.702089490347\\r\\n570,2000-01-03,570,09:00:00,ZIkRwPWyWP,9368.662605679918\\r\\n580,2000-01-03,580,10:00:00,uphgoCQwZY,3321.0096306747137\\r\\n590,2000-01-03,590,11:00:00,jEKaqqScLF,8442.084614664149\\r\\n600,2000-01-03,600,12:00:00,kSIJFBHVnL,4065.19226287942\\r\\n610,2000-01-03,610,13:00:00,YRhoAN
skYn,5089.668482943252\\r\\n620,2000-01-03,620,14:00:00,SnlwCSdkWf,5738.46737129545\\r\\n630,2000-01-03,630,15:00:00,ANfpLOiJTV,393.77545256928823\\r\\n640,2000-01-03,640,16:00:00,DUxigzNtLz,6798.725575133883\\r\\n650,2000-01-03,650,17:00:00,jaJECwmWTY,5178.597327486391\\r\\n660,2000-01-03,660,18:00:00,tzrWZLSELo,7467.995039288831\\r\\n670,2000-01-03,670,19:00:00,rbUWLCKjeV,4013.698847016407\\r\\n680,2000-01-03,680,20:00:00,JKFAZgEkja,1538.6412971598695\\r\\n690,2000-01-03,690,21:00:00,uEomQhtneK,2849.6558284053976\\r\\n700,2000-01-03,700,22:00:00,VNqwqzfgXT,6756.852702484582\\r\\n710,2000-01-03,710,23:00:00,YzYqAlWMKn,9250.2543956494\\r\\n720,2000-01-04,720,00:00:00,VBrvxVqNpT,7430.930594705144\\r\\n730,2000-01-04,730,01:00:00,KxgdYwiVtl,1190.2548337790097\\r\\n740,2000-01-04,740,02:00:00,oPUENybUiS,247.4663426770396\\r\\n750,2000-01-04,750,03:00:00,bgpLfCsNrU,6472.8593061097\\r\\n760,2000-01-04,760,04:00:00,xmRUnIzNOL,5791.031151521782\\r\\n770,2000-01-04,770,05:00:00,SsYMDEINvO,347.35344936110636\\r\\n780,2000-01-04,780,06:00:00,XuorBLXsEt,9003.971751685769\\r\\n790,2000-01-04,790,07:00:00,jRYnFPYRKE,858.8836157464275\\r\\n800,2000-01-04,800,08:00:00,uRRXIdQDYH,4914.608250347407\\r\\n810,2000-01-04,810,09:00:00,nxkVSEnKXv,3586.0998633311424\\r\\n820,2000-01-04,820,10:00:00,BddLdFLDkg,9392.836980063128\\r\\n830,2000-01-04,830,11:00:00,MNuZvbMDqM,4075.512732895953\\r\\n840,2000-01-04,840,12:00:00,KfiIyqdZJq,4450.624248264806\\r\\n850,2000-01-04,850,13:00:00,ZNzdZZhipO,5155.329570863023\\r\\n860,2000-01-04,860,14:00:00,MmVEuWyJJt,7125.153628136557\\r\\n870,2000-01-04,870,15:00:00,QTVeqONJWF,7459.723393845693\\r\\n880,2000-01-04,880,16:00:00,sVHRlErfHm,5349.520468668593\\r\\n890,2000-01-04,890,17:00:00,OfcunHkqxU,2538.9594014567383\\r\\n900,2000-01-04,900,18:00:00,rXTISMpGvf,6136.26826553925\\r\\n910,2000-01-04,910,19:00:00,YYgIQPrYmN,2828.778965008356\\r\\n920,2000-01-04,920,20:00:00,acLWVYscRm,2135.4492617161204\\r\\n930,2000-01-04,930,21:00:00,ejuIuzrhoE,7853.2027
7523869\\r\\n940,2000-01-04,940,22:00:00,nEIyUKZvtl,9026.298438227512\\r\\n950,2000-01-04,950,23:00:00,fVrPrRMjgE,1108.9112508806\\r\\n960,2000-01-05,960,00:00:00,aQbeIHZfrq,6779.761579736982\\r\\n970,2000-01-05,970,01:00:00,NSYmULwYsy,4710.484556444787\\r\\n980,2000-01-05,980,02:00:00,OstJdNkpJM,6696.018116272272\\r\\n990,2000-01-05,990,03:00:00,zPdwVSfwsw,1019.0631993852805\\r\\n1000,2000-01-05,1000,04:00:00,PrPiNtxItj,4786.919229745998\\r\\n1010,2000-01-05,1010,05:00:00,iTrMpbwDkd,1082.2792701135043\\r\\n1020,2000-01-05,1020,06:00:00,VIOGBhjuvc,6712.260837571906\\r\\n1030,2000-01-05,1030,07:00:00,vKfivaIyHN,8660.527086155422\\r\\n1040,2000-01-05,1040,08:00:00,bAlxEIEfpN,1415.7747325826188\\r\\n1050,2000-01-05,1050,09:00:00,cJPGJmIKdc,9816.3246377919\\r\\n1060,2000-01-05,1060,10:00:00,AdSXaKQpQX,3536.32709953549\\r\\n1070,2000-01-05,1070,11:00:00,PHntAagAlw,7431.850668273714\\r\\n1080,2000-01-05,1080,12:00:00,ZtQrFBobvY,4224.027690860892\\r\\n1090,2000-01-05,1090,13:00:00,ZuPnbhaSOU,3484.8530656320654\\r\\n1100,2000-01-05,1100,14:00:00,qOSVmejqdo,6847.384220484392\\r\\n1110,2000-01-05,1110,15:00:00,kwckywqRbb,5867.829131220223\\r\\n1120,2000-01-05,1120,16:00:00,JLrzzbUfDi,6991.180870142121\\r\\n1130,2000-01-05,1130,17:00:00,qPuDjhipNE,2544.115558392327\\r\\n1140,2000-01-05,1140,18:00:00,nTuOipVPUZ,3521.350549002792\\r\\n1150,2000-01-05,1150,19:00:00,FxTDpmsUYC,5796.837844528479\\r\\n1160,2000-01-05,1160,20:00:00,IilnnODeoz,9981.446352555968\\r\\n1170,2000-01-05,1170,21:00:00,lJpBtcVSww,8659.609927822496\\r\\n1180,2000-01-05,1180,22:00:00,uefmaifDgk,164.5549179029382\\r\\n1190,2000-01-05,1190,23:00:00,AQsKnkJxOV,455.31829622753816\\r\\n1200,2000-01-06,1200,00:00:00,IUcDyPSHIE,5727.976331105652\\r\\n1210,2000-01-06,1210,01:00:00,nrEdNiWGdi,2015.5167059418156\\r\\n1220,2000-01-06,1220,02:00:00,EflmCojQzg,9514.004760633412\\r\\n1230,2000-01-06,1230,03:00:00,LsAIvtooWr,7898.8225145572\\r\\n1240,2000-01-06,1240,04:00:00,yiDOUysGHw,4219.262059231663\\r\\n1250,2000-01-06,
1250,05:00:00,idWAZATxwy,3043.2304072778616\\r\\n1260,2000-01-06,1260,06:00:00,sBedlknKzY,3840.820372936372\\r\\n1270,2000-01-06,1270,07:00:00,ReEmhVRAjb,6966.434389542963\\r\\n1280,2000-01-06,1280,08:00:00,XnFrfzMBKt,6041.8596064524045\\r\\n1290,2000-01-06,1290,09:00:00,MaMMHEWEIf,2569.2675325271707\\r\\n1300,2000-01-06,1300,10:00:00,OUpokSyVfO,7387.813510302333\\r\\n1310,2000-01-06,1310,11:00:00,VgCigxOcbF,7695.008235452545\\r\\n1320,2000-01-06,1320,12:00:00,ouRNYgSzXq,3293.250454887212\\r\\n1330,2000-01-06,1330,13:00:00,iQczJExipS,1892.9945453269115\\r\\n1340,2000-01-06,1340,14:00:00,vVbLlDWFCr,7105.276586964716\\r\\n1350,' with open("bug_csv.csv", "wb") as f: f.write(buffer) with open("bug_csv.csv", encoding="unicode_escape", newline="") as f: f.readline() |
|||
msg403848 - (view) | Author: Matthew Barnett (mrabarnett) | Date: 2021-10-13 16:25 | |
It can be shortened to this:

buffer = b"a" * 8191 + b"\\r\\n"
with open("bug_csv.csv", "wb") as f:
    f.write(buffer)
with open("bug_csv.csv", encoding="unicode_escape", newline="") as f:
    f.readline()

To me it looks like it's reading in blocks of 8K and then decoding them, but it isn't correctly handling an escape sequence that happens to cross a block boundary. |
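
The block-boundary explanation can be checked without any file I/O. The following is only a sketch: it assumes the text layer hands the decoder 8192-byte chunks (the "8K" blocks guessed at above), and `try_decode` is a hypothetical helper written for this illustration, not something from the standard library.

```python
import codecs

CHUNK_SIZE = 8192  # assumption: the 8K block size suggested in the message


def try_decode(data):
    """Decode data as one complete unicode_escape input.

    Returns (text, None) on success or (None, exception) on failure.
    """
    try:
        return codecs.decode(data, "unicode_escape"), None
    except UnicodeDecodeError as exc:
        return None, exc


# Same payload as the shortened reproducer: 8191 filler bytes followed by
# the four literal characters backslash, r, backslash, n.
buffer = b"a" * 8191 + b"\\r\\n"

# Decoding the whole buffer in one call sees both escapes complete.
whole_text, whole_error = try_decode(buffer)

# Cutting at the assumed chunk boundary leaves a lone backslash as the
# last byte of the first chunk, which is not a complete escape sequence.
first_chunk = buffer[:CHUNK_SIZE]
first_text, first_error = try_decode(first_chunk)

print("whole buffer error:", whole_error)
print("first chunk error position:", first_error.start)
print("first chunk error reason:", first_error.reason)
```

The error position reported for the first chunk can be compared with the position in the original traceback, which is what makes the chunk-boundary theory plausible.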
|||
msg403892 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2021-10-14 10:17 | |
New changeset c96d1546b11b4c282a7e21737cb1f5d16349656d by Serhiy Storchaka in branch 'main': bpo-45461: Fix IncrementalDecoder and StreamReader in the "unicode-escape" codec (GH-28939) https://github.com/python/cpython/commit/c96d1546b11b4c282a7e21737cb1f5d16349656d |
|||
msg403919 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2021-10-14 17:02 | |
New changeset 0bff4ccbfd3297b0adf690655d3e9ddb0033bc69 by Miss Islington (bot) in branch '3.10': [3.10] bpo-45461: Fix IncrementalDecoder and StreamReader in the "unicode-escape" codec (GH-28939) (GH-28943) https://github.com/python/cpython/commit/0bff4ccbfd3297b0adf690655d3e9ddb0033bc69 |
|||
msg403920 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2021-10-14 17:03 | |
New changeset 7c722e32bf582108680f49983cf01eaed710ddb9 by Serhiy Storchaka in branch '3.9': [3.9] bpo-45461: Fix IncrementalDecoder and StreamReader in the "unicode-escape" codec (GH-28939) (GH-28945) https://github.com/python/cpython/commit/7c722e32bf582108680f49983cf01eaed710ddb9 |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:51 | admin | set | github: 89624 |
2021-10-14 17:51:53 | serhiy.storchaka | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
2021-10-14 17:03:33 | serhiy.storchaka | set | messages: + msg403920 |
2021-10-14 17:02:29 | serhiy.storchaka | set | messages: + msg403919 |
2021-10-14 10:56:35 | serhiy.storchaka | set | pull_requests: + pull_request27233 |
2021-10-14 10:30:09 | serhiy.storchaka | link | issue45467 dependencies |
2021-10-14 10:17:32 | serhiy.storchaka | set | messages: + msg403892 |
2021-10-14 10:17:21 | miss-islington | set | nosy: + miss-islington pull_requests: + pull_request27231 |
2021-10-13 21:31:05 | serhiy.storchaka | set | keywords: + patch stage: patch review pull_requests: + pull_request27228 |
2021-10-13 18:45:55 | serhiy.storchaka | set | assignee: serhiy.storchaka versions: + Python 3.9, Python 3.10, Python 3.11, - Python 3.8 |
2021-10-13 17:08:24 | vstinner | set | nosy: + serhiy.storchaka |
2021-10-13 16:25:23 | mrabarnett | set | nosy: + mrabarnett messages: + msg403848 |
2021-10-13 14:55:06 | anmyachev | set | messages: + msg403840 |
2021-10-13 14:41:20 | vstinner | set | messages: + msg403838 |
2021-10-13 14:31:37 | anmyachev | create |