classification
Title: Adding a git developer's guide to Mercurial to devguide
Type: enhancement Stage: resolved
Components: Devguide Versions:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: berker.peksag, brett.cannon, demian.brecht, ezio.melotti, pitrou, python-dev
Priority: normal Keywords: patch

Created on 2014-12-04 00:09 by demian.brecht, last changed 2015-01-16 18:00 by brett.cannon. This issue is now closed.

Files
File name Uploaded Description Edit
gitdevs.patch demian.brecht, 2014-12-04 00:09
issue22992.patch demian.brecht, 2014-12-05 15:56
issue22992_1.patch demian.brecht, 2014-12-11 01:02
Messages (21)
msg232100 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-12-04 00:09
Coming out of a recent thread in python-dev, it was mentioned that adding a git developer's guide to mercurial may be beneficial to help lower the initial barrier of entry for prospective contributors (https://mail.python.org/pipermail/python-dev/2014-November/137280.html). The attached patch is a slightly reworded version of my blog post here: http://demianbrecht.github.io/vcs/2014/07/31/from-git-to-hg/.
msg232101 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2014-12-04 01:00
I have a few comments about the patch.

About the markup:
1) you can specify the default highlight (bash) once at the top of the file, and just use :: afterwards instead of .. code-block::;
2) the ```` used for some headers is inconsistent with the other files;
3) when you use NamedBranches, do you want to create a link? if so you should use :ref:`NamedBranches` and add a reference;

About the content:
1) you should make clear that bookmarks are local and they won't be included in the patch and/or affect the main repo;
2) we usually want a single diff for each issue -- you should show how to create a diff that includes all the changes related to issueA or issueB (this can be done easily with named branches, not sure about bookmarks);
3) hg ci --amend could be used to update existing changesets after a review (this will also solve the above problem);
4) it should be possible to use evolve instead of bookmarks, but evolve is currently still a work in progress, so I'm not sure is worth mentioning it yet;
5) you said that named branches, bookmarks, and queues are all branches types in Mercurial.  Can you add a source for that?  (I never heard of bookmarks and queues as branches, but they might be, and there are also non-named branches.  I just want to make sure that the terms are used correctly.);
6) when you mention :code:`hg log -G`, you should probably mention -L10 too, otherwise on the CPython repo the command will result in a graph with 90k+ changesets;
7) "Note that Mercurial doesn't have git's concept of staging, so all changes will be committed." <- To avoid committing there are different ways: a) leave the code uncommitted in the working copy (works with one or two unrelated issues, but doesn't scale well); b) save changes on a .diff and apply it when needed (what I usually do, scales well but it has a bit of overhead since you need to update/apply the diff); c) use mq; d) use evolve.  Mercurial also has the concept of phases (http://mercurial.selenic.com/wiki/Phases) that might be somewhat related, even though it affected committed changes (if you set a cs as secret it won't be pushed);
8) Trying to learn usual hg workflows might also be easier than trying to use a git-like workflow on mercurial, so you could suggest considering them too (I don't remember if we have anything about it in the devguide though).
msg232102 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2014-12-04 03:56
+The workflow of a developer might look something like this:

"a core developer" or "a contributor"? It would be good to split core developer and contributor workflows.

+    # address review comments and merge
+    git checkout master
+    git merge issueA
+    git branch -d issueA

For example, from the contributor side, you shouldn't touch the master branch at all.

It should be:

    # address review comments
    git commit -a
    # check upstream and rebase
    git pull --rebase upstream master  # or "origin master" in this example
    # push changes (optional)
    git push origin issueA  # origin should be contributor's fork here
msg232141 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-12-04 16:06
Thanks for the comments, will try to have an updated patch today.

> Trying to learn usual hg workflows might also be easier than trying to use a git-like workflow on mercurial, so you could suggest considering them too (I don't remember if we have anything about it in the devguide though).

(For other as we briefly chatted about this over IM:) The intention of this guide was to minimize the cognitive load of context switching between git and mercurial projects for git developers just beginning to use mercurial. As their experience with mercurial increases, their workflow may change to be more of an "out of the box" type flow.

Git developers (myself included) will likely spend a good deal of time trying to figure out "how to do X as I would in git using Mercurial". This is simply to try and lower the initial cost of development for new contributors.
msg232194 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-12-05 15:56
I believe most points should be addressed in the updated patch. Below are responses to other comments (those not included have resulted in changes in the patch).


@Ezio:

> 4) it should be possible to use evolve instead of bookmarks, but evolve is currently still a work in progress, so I'm not sure is worth mentioning it yet;

I don't think that it's worth mentioning at this point given it's in an experimental state.

5) you said that named branches, bookmarks, and queues are all branches types in Mercurial.  Can you add a source for that?

No I can't, I was apparently completely off my face with the wording there :)

7) "Note that Mercurial doesn't have git's concept of staging, so all changes will be committed." [...]

Agreed that there are a number of solutions to this, my intention here is simply to point out a difference that developers coming from a git background will be surprised about. This one caught me a little off guard initially as I personally really like git's staging feature and bank on it at times.

@Berker:

> "a core developer" or "a contributor"?

Great point, thanks for that. My intention was to remove all merging-related operations and tailor it entirely towards contributors. I've reworded passages to make this clear and have removed all steps related to merging.
msg232196 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2014-12-05 16:09
+    # address review comments and merge

You can remove the "and merge" part.

+    # generate patch for submission
+    git diff master..issueA > issueA.patch

Contributors may need to update their repos before this step. It would be good to add a note about "git pull --rebase".

+    # create patch
+    # git diff master..issueA > issueA.patch
+    hg diff -c issueA > issueA.patch

Same here.
msg232265 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-12-07 01:42
I was going to say I'm skeptical that we need this but the proposed text is well researched and well written (I'll probably learn something about bookmarks myself, actually). Kudos!
msg232389 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-12-09 16:14
> Contributors may need to update their repos before this step. It would be good to add a note about "git pull --rebase".

Working through this flow actually exposed a blocking issue as far as this guide goes. As I understand it, because Mercurial sends hashes of all topological heads in headers, the size of the headers goes quite quickly as the number of topological heads increases. When the size of the headers exceeds what the server accepts, you run into 400 responses.

Details in a blog post: http://barahilia.github.io/blog/computers/2014/10/09/hg-pull-via-http-fails-400.html

One of the suggested options listed in https://bitbucket.org/site/master/issue/8263/http-400-bad-request-error-when-pulling is to use SSH rather than HTTPS, but is that even an option for non core devs?

I'm investigating other options right now, but thanks for pointing out the merging Berker, it definitely pointed out a flaw in this workflow.
msg232390 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-12-09 16:36
One option would be to increase LimitRequestFieldSize in Apache, but that really is only a stopgap solution as that limit will be hit again as one's local repo grows over time. It really would be nice to allow read only access via SSH, which isn't affected by the header issue, but I'm pretty sure that would entail a decent amount of work.
msg232393 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-12-09 16:59
Le 09/12/2014 17:36, Demian Brecht a écrit :
> 
> One option would be to increase LimitRequestFieldSize in Apache, but
that really is only a stopgap solution as that limit will be hit again
as one's local repo grows over time. It really would be nice to allow
read only access via SSH, which isn't affected by the header issue, but
I'm pretty sure that would entail a decent amount of work.

Did you actually hit this issue with hg.python.org?
msg232394 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-12-09 17:12
> Did you actually hit this issue with hg.python.org?

Yes I did, with only one additional head: https://gist.github.com/demianbrecht/f525cceda8a0c99794ae. This would also lead me to believe that the public at large will also start hitting this with the introduction of new named branches.
msg232444 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-12-10 21:22
After some further investigation, this seems to be what's happening. Note: this is based on a small amount of research and hasn't been validated by any mercurial devs, so it might be missing key pieces: In cases where a local topological head is unknown to a remote, the commit history of that head's branch is sent to the remote server for comparison. This /is/ chunked, but it seems that the number of commits sent per request is hardcoded to 200 (in Mercurial 2.9.2 at any rate). Because 200 commit hashes are added to the header, the size of the header is greater than what is allowed by the app server (I'm seeing > 8k headers when the discovery step runs with my changes) leading to the error responses when using HTTP(S). So, to encounter this issue, the following must be true:

* The user must have a topological head that the remote is unaware of
* The number of commits resulting from a DAG traversal must result in a header that exceeds the allowed limits of the given app server

Because the discovery requests are chunked, if the allowed header size was increased on the app server to account for the worst case scenario (200 commit hashes), this problem should be fixed and not encountered again as I previously thought would be the case.
msg232445 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-12-10 21:25
I suggest you post your findings on the mercurial mailing-list (http://selenic.com/mailman/listinfo/mercurial, there's a gmane mirror also). I would be surprised if there weren't an easy way of circumventing it.
msg232449 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-12-10 22:11
The issue and solution was confirmed on IRC in #mercurial by mpm (Matt Mackall). According to him, there is also no easy way to change this behaviour from the client. The problem isn't /really/ a Mercurial issue, but an issue with web server specific defaults.

The default for Apache (which is what is being used here, correct?) is 8k. The worst case header size was ~8.4k. So I'd say that upping the limit to 9k or 10k would account for the worst case scenario as well as add a small buffer.
msg232456 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-12-11 01:02
I've added a section about rebasing and removed the comment about merging. Of course, I'm assuming that the config changes can be made to Apache to allow for this workflow.
msg232510 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-12-12 05:48
The issue was actually HAProxy rather than Apache (I had misread a line in the documentation so my understanding of Apache header limitations was incorrect). This was fixed by Donald today. If there are no objections, this should be good to merge now.
msg233232 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2014-12-31 09:25
Ping
msg233766 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2015-01-09 16:46
If Antoine or Ezio don't make any more comments or commit this themselves then I will take the patch as-is and commit it next Friday.

Sorry for the delay, Demian, but December is always a slow month due to holidays.
msg233769 - (view) Author: Demian Brecht (demian.brecht) * (Python triager) Date: 2015-01-09 17:02
Thanks for the response Brett and no worries, the delay is totally understandable.
msg234138 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-01-16 17:59
New changeset 51c89a0c30b4 by Brett Cannon in branch 'default':
Issue #22992: A git developer's guide to hg.
https://hg.python.org/devguide/rev/51c89a0c30b4
msg234139 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2015-01-16 18:00
Thanks for the hard work, Demian! I don't remember how often the devguide is updated online, but it should hopefully be no longer than a day (probably more like an hour).
History
Date User Action Args
2015-01-16 18:00:17brett.cannonsetstatus: open -> closed
resolution: fixed
messages: + msg234139

stage: patch review -> resolved
2015-01-16 17:59:31python-devsetnosy: + python-dev
messages: + msg234138
2015-01-09 17:02:23demian.brechtsetmessages: + msg233769
2015-01-09 16:46:59brett.cannonsetmessages: + msg233766
2014-12-31 09:25:18demian.brechtsetmessages: + msg233232
2014-12-12 05:48:32demian.brechtsetmessages: + msg232510
2014-12-11 01:02:24demian.brechtsetfiles: + issue22992_1.patch

messages: + msg232456
2014-12-10 22:11:32demian.brechtsetmessages: + msg232449
2014-12-10 21:25:19pitrousetmessages: + msg232445
2014-12-10 21:22:27demian.brechtsetmessages: + msg232444
2014-12-09 17:12:14demian.brechtsetmessages: + msg232394
2014-12-09 16:59:11pitrousetmessages: + msg232393
2014-12-09 16:36:38demian.brechtsetmessages: + msg232390
2014-12-09 16:14:01demian.brechtsetmessages: + msg232389
2014-12-07 01:42:57pitrousetnosy: + pitrou
messages: + msg232265
2014-12-05 16:09:25berker.peksagsetmessages: + msg232196
2014-12-05 15:56:46demian.brechtsetfiles: + issue22992.patch

messages: + msg232194
2014-12-04 16:06:06demian.brechtsetmessages: + msg232141
2014-12-04 15:32:29brett.cannonsetnosy: + brett.cannon
2014-12-04 03:56:25berker.peksagsetnosy: + berker.peksag
messages: + msg232102

type: enhancement
stage: patch review
2014-12-04 01:00:55ezio.melottisetmessages: + msg232101
2014-12-04 00:09:13demian.brechtcreate