classification
Title: uuid constructor accept invalid strings (extra dash)
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.9, Python 3.8, Python 3.7, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Cédric Cabessa, Windson Yang, josh.r, taleinat
Priority: normal Keywords:

Created on 2019-04-30 09:19 by Cédric Cabessa, last changed 2022-04-11 14:59 by admin.

Messages (5)
msg341141 - (view) Author: Cédric Cabessa (Cédric Cabessa) Date: 2019-04-30 09:19
The UUID constructor accepts strings with too many dashes or with keywords like urn: / uuid:

For example, this code does not raise:

```
>>> import uuid
>>> uuid.UUID('0be--468urn:urn:urn:urn:54-4bf9-41----------d4-9697-41d735uuid:4fbe85uuid:')
UUID('0be46854-4bf9-41d4-9697-41d7354fbe85')
```

For context, we use a validator based on `uuid.UUID` for an API.
Some customers send strings containing a UUID followed by extra `-` characters; the validator lets them pass, but the SQL connector raises an exception.

We worked around this in our validator, but the UUID constructor should not accept strings like the one in the example.
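
A stricter check of the kind mentioned above could look roughly like the following. This is only a minimal sketch, assuming that "valid" means the canonical 8-4-4-4-12 hexadecimal form; the helper name `parse_strict_uuid` and the regular expression are illustrative and are not part of the `uuid` module or of the validator described here.

```python
import re
import uuid

# Canonical 8-4-4-4-12 hexadecimal form, e.g. 0be46854-4bf9-41d4-9697-41d7354fbe85.
# Assumption: no braces, no URN prefix, and no extra hyphens are allowed.
_CANONICAL_UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def parse_strict_uuid(value):
    """Return a uuid.UUID, or raise ValueError unless value is in canonical form.

    Hypothetical helper for illustration only.
    """
    # Check the shape of the string first, then let uuid.UUID do the conversion.
    if not isinstance(value, str) or _CANONICAL_UUID.fullmatch(value) is None:
        raise ValueError("not a canonical UUID string: %r" % (value,))
    return uuid.UUID(value)
```

With this pre-check, a string with a trailing `-` or an embedded `urn:` is rejected before it ever reaches `uuid.UUID`.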
msg341154 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2019-04-30 15:32
The documentation does describe a fairly flexible parser. Perhaps it's a little too flexible on stuff like URN prefixes, but I don't think we could start enforcing a stricter class of hyphen separations without potentially breaking existing code.

Is there a reason your validator doesn't use uuid.UUID to normalize the value? That is, whatever the customer provides, why not use the result of stringifying the resulting UUID, rather than just converting to UUID to validate and then throwing the result away? As long as the result is compatible with your SQL connector, and logically equivalent to what the customer provided, that seems a valid solution.
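
As a rough illustration of the normalization suggested above (a sketch only; the helper name `normalize_uuid` is hypothetical), the value passed on to the database would be the string form of the parsed UUID rather than the customer's original input:

```python
import uuid


def normalize_uuid(value):
    """Parse value with uuid.UUID and return its canonical string form.

    Hypothetical helper; raises ValueError if uuid.UUID rejects the input.
    """
    return str(uuid.UUID(value))


# The documented optional decorations (braces, URN prefix) and upper-case
# digits all normalize to the same lower-case, hyphenated string.
braced = normalize_uuid("{0be46854-4bf9-41d4-9697-41d7354fbe85}")
urn = normalize_uuid("urn:uuid:0BE46854-4BF9-41D4-9697-41D7354FBE85")
print(braced)
print(urn)
```

Whatever `uuid.UUID` accepts, only the normalized string is stored, so the connector never sees the stray characters.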
msg341216 - (view) Author: Cédric Cabessa (Cédric Cabessa) Date: 2019-05-01 17:24
> Is there a reason your validator doesn't use uuid.UUID to normalize the value? That is, whatever the customer provides, why not use the result of stringifying the resulting UUID

Yes, this is exactly what we do now.

However, this behaviour is a bit surprising; we thought that uuid.UUID could be used as a validator.

I understand the risk of breaking existing code with a fix that enforces a strict form.

Maybe a line should be added to the documentation to warn people against using this as a validator without further checks?

But OK, you can close the bug if you prefer ... I think there is no perfect solution :-)
msg341238 - (view) Author: Windson Yang (Windson Yang) * Date: 2019-05-02 00:36
> Maybe a line should be added to the documentation to warn people against using this as a validator without further checks?

I wouldn't expect uuid.UUID to be usable as a validator myself, but I agree we can warn users in the documentation if lots of them are confused about it.
msg349045 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2019-08-05 10:21
I too find this surprising, especially given how thoroughly UUID validates inputs of types other than "hex".

The documentation simply states that for hex input, hyphens, curly braces and a URN prefix are optional.  In practice, though, it is much more lenient than that, as described here.
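
One way this kind of leniency can arise is if the parser deletes every occurrence of the markers and hyphens instead of checking where they appear. The following is a sketch under that assumption, written for illustration; `lenient_hex_cleanup` is a hypothetical name and the code is not quoted from the `uuid` module.

```python
def lenient_hex_cleanup(text):
    """Hypothetical stand-in for a lenient cleanup step (illustration only)."""
    # Remove the markers wherever they occur, not only at the start.
    text = text.replace("urn:", "").replace("uuid:", "")
    # Drop surrounding braces, then every hyphen regardless of position.
    text = text.strip("{}").replace("-", "")
    # Only the final length is checked, so stray markers and hyphens go unnoticed.
    if len(text) != 32:
        raise ValueError("badly formed hexadecimal UUID string")
    return text


# The string from the original report reduces to 32 hex digits.
cleaned = lenient_hex_cleanup(
    "0be--468urn:urn:urn:urn:54-4bf9-41----------d4-9697-41d735uuid:4fbe85uuid:"
)
print(cleaned)
```

Under such a scheme, position and repetition of the optional pieces are never examined, which matches the result reported at the top of this issue.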

Since the uuid module has no expert listed, we'll have to decide whether to make the input validation stricter and break backwards-compatibility, or simply make the docs clearer.  Clarifying the docs certainly seems simpler, safer and more user-friendly.  It also seems reasonable, given that this issue apparently hasn't affected many users.
History

| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-11 14:59:14 | admin | set | github: 80938 |
| 2019-08-05 10:21:04 | taleinat | set | versions: - Python 3.5, Python 3.6; nosy: + taleinat; messages: + msg349045; type: behavior |
| 2019-05-02 00:36:23 | Windson Yang | set | nosy: + Windson Yang; messages: + msg341238 |
| 2019-05-01 17:24:56 | Cédric Cabessa | set | messages: + msg341216 |
| 2019-04-30 15:32:48 | josh.r | set | nosy: + josh.r; messages: + msg341154 |
| 2019-04-30 09:19:48 | Cédric Cabessa | create | |