Message 374509 - Python tracker

Author	corona10
Recipients	Jim.Jewett, berker.peksag, corona10, gvanrossum, serhiy.storchaka, xtreak
Date	2020-07-28.16:35:08
Message-id	<1595954108.7.0.337873239211.issue40841@roundup.psfhosted.org>
In-reply-to

Content
> I think that both functions for detecting file type, by name and by content

I think so too. MIME sniffing would not replace detection based on the file extension, so both APIs should be provided.
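
To illustrate why the two approaches complement each other, here is a minimal sketch. `mimetypes.guess_type` is the existing stdlib API for name-based detection; `sniff_mime_type` and its signature table are hypothetical names used only for illustration (not the proposed API), and the table is a small subset of the byte patterns that content sniffing usually checks.

```python
import mimetypes

# Hypothetical, heavily abbreviated signature table (illustration only).
_SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
)


def sniff_mime_type(header: bytes):
    """Guess a MIME type from the leading bytes; return None if unknown."""
    for magic, mime in _SIGNATURES:
        if header.startswith(magic):
            return mime
    return None


# Detection by name: only the extension is consulted.
by_name, _ = mimetypes.guess_type("photo.png")

# Detection by content: the name is ignored, so a GIF saved as "photo.png"
# is still recognized as a GIF.
by_content = sniff_mime_type(b"GIF89a" + b"\x00" * 10)
```

The two results can disagree for a mislabeled file, which is one reason to offer both functions instead of folding one into the other.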

> should not we add also the code for detecting the text encoding based on other algorithms used in browsers

I have already added code for text encoding detection based on the WHATWG standard, so if this API lands, text encoding detection (e.g. utf-16be) will be supported.
IMHO, there are use cases for it, since Python is used heavily for text data handling today (for example crawling and data pre-processing).
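
As a rough sketch of the kind of detection meant here (not the code from the patch), byte order marks can be checked at the start of the data. The function name `sniff_bom_encoding` is hypothetical, and the sketch assumes only the three common BOMs (UTF-8, UTF-16BE, UTF-16LE) are of interest; it returns Python codec names.

```python
# Byte order marks mapped to Python codec names (assumed subset).
_BOMS = (
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xfe\xff", "utf-16-be"),
    (b"\xff\xfe", "utf-16-le"),
)


def sniff_bom_encoding(header: bytes):
    """Return (codec_name, bom_length) if a known BOM leads the data, else None."""
    for bom, codec in _BOMS:
        if header.startswith(bom):
            return codec, len(bom)
    return None


# Example: crawled bytes that start with a UTF-16BE byte order mark.
raw = b"\xfe\xff" + "crawled text".encode("utf-16-be")
found = sniff_bom_encoding(raw)
if found is not None:
    codec, skip = found
    text = raw[skip:].decode(codec)  # decode the payload after the BOM
```

Data without a BOM yields `None` here, so a caller would still need a fallback such as a declared charset or a default encoding.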

There is the question of whether a standard written for browsers is appropriate for a Python stdlib module.
My answer is that the WHATWG standard could be one of the best standards to follow if we decide to provide MIME sniffing.

The standard handles MIME types that are widely used in the real world, not only by browsers but also by HTTP servers and other software.

One of the big burdens of maintaining MIME type detection is deciding how many MIME types should be supported.
Luckily, WHATWG can serve as a strong standard on which to base that decision.

History
Date	User	Action	Args
2020-07-28 16:35:08	corona10	set	recipients: + corona10, gvanrossum, berker.peksag, Jim.Jewett, serhiy.storchaka, xtreak
2020-07-28 16:35:08	corona10	set	messageid: <1595954108.7.0.337873239211.issue40841@roundup.psfhosted.org>
2020-07-28 16:35:08	corona10	link	issue40841 messages
2020-07-28 16:35:08	corona10	create