git submodules

Linus Torvalds torvalds at linux-foundation.org
Fri Nov 10 13:05:27 PST 2017


On Fri, Nov 10, 2017 at 12:35 PM, Dirk Hohndel <dirk at hohndel.org> wrote:
>
>>
>> So I think it could be a good idea, but I do want to warn you:
>> submodules can be confusing. I would expect the occasional colorful
>> language until people get used to them.
>
> Oh this sounds so promising.
> Do tell us about the hiccups you expect - assuming there are “typical”
> pitfalls beyond “our build scripts and tooling need to change”.

The most common issue is that people expect submodules to be more
tightly integrated than they are.

Most operations will *not* recurse into the submodule, even if it
looks like it is part of the tree.  It very much is a separate git
repository, and most git commands won't actually look at the
submodules unless you explicitly ask for it.
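
As a rough illustration of that boundary, here is a throwaway sandbox
sketch. The names "lib" and "app" are made up for the demo ("lib"
stands in for libdivecomputer), and the protocol.file.allow override
is only there because the demo clones from a local path. It uses
"git grep" as one example of a command that stays out of the
submodule unless told otherwise:

```shell
#!/bin/sh
# Sketch only: "lib" and "app" are hypothetical stand-ins, built in a temp dir.
set -e
demo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# The would-be submodule: an ordinary repository with one file.
git init -q "$demo/lib"
echo "needle" > "$demo/lib/hello.txt"
git -C "$demo/lib" add hello.txt
git -C "$demo/lib" commit -q -m "lib: add hello.txt"

# The top-level project, with lib added as a submodule.
git init -q "$demo/app"
cd "$demo/app"
# protocol.file.allow is only needed because the demo clones a local path.
git -c protocol.file.allow=always submodule --quiet add "$demo/lib" lib
git commit -q -m "app: add lib submodule"

# A plain grep stops at the submodule boundary...
plain=$(git grep -l needle || true)
# ...while the recursive variant descends into lib.
recursive=$(git grep --recurse-submodules -l needle || true)

echo "plain:     [$plain]"
echo "recursive: [$recursive]"
```

The plain search should come back empty even though lib/hello.txt is
sitting right there in the working tree.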

In many ways git submodule behavior actually mirrors what we do now:
you fetch the libdivecomputer project separately, and you work in it
separately.

Many (but not all) git commands will then have specific flags to say
"do this recursively for submodules too".  For example, if you use
branches, and do "git checkout", by default it does *not* follow the
submodule. You'd have to do that either by hand, or use "git checkout
--recurse-submodules" to do it for you.

But if you don't update the subproject, you'll see in "git status" and
"git diff" that the submodule doesn't match the current HEAD. So there
_is_ integration, it's just that by default the submodules are
considered fairly separate.
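
The checkout behavior described above can be tried in a sandbox along
these lines. This is a sketch under assumptions: "lib", "app" and the
branch name "newer" are invented for the demo, and "by hand" is taken
here to mean "git submodule update":

```shell
#!/bin/sh
# Sketch only: "lib", "app" and the branch "newer" are hypothetical names.
set -e
demo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q "$demo/lib"
git -C "$demo/lib" commit -q --allow-empty -m "lib: first commit"

git init -q "$demo/app"
cd "$demo/app"
# protocol.file.allow is only needed because the demo clones a local path.
git -c protocol.file.allow=always submodule --quiet add "$demo/lib" lib
git commit -q -m "app: add lib submodule"
base=$(git symbolic-ref --short HEAD)
old=$(git -C lib rev-parse HEAD)

# On a branch, move the submodule forward and record the new commit.
git checkout -q -b newer
git -C lib commit -q --allow-empty -m "lib: second commit"
new=$(git -C lib rev-parse HEAD)
git add lib
git commit -q -m "app: bump lib"

# Plain checkout switches the top-level tree but leaves lib where it was.
git checkout -q "$base"
after_plain=$(git -C lib rev-parse HEAD)
status_plain=$(git status --short)   # reports lib as modified

# Catch up by hand...
git -c protocol.file.allow=always submodule --quiet update
after_update=$(git -C lib rev-parse HEAD)

# ...or let checkout move the submodule along with the branch.
git checkout -q --recurse-submodules newer
after_recurse=$(git -C lib rev-parse HEAD)

echo "status after plain checkout: $status_plain"
```

After the plain checkout the submodule is still at the second commit,
which is exactly the mismatch that "git status" points out.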

And things like "git pull" when you pull something that has a
different version of libdivecomputer get even more complex. You now
have the added question of "do you want to merge the subproject
changes too?"

In other words, the default git behavior is actually very close to our
current workflow ("separate project"), just with "added tracking", and
with the _option_ to recurse into the submodules.

But people often expect a much more tightly integrated model, where
the subprojects always get updated together with the top-level
project, and that's not how it works. It's mainly an "allow tracking"
model together with support for explicitly recursing if you want to.

Actually, the best way to explain what I mean might be "git diff". You
can show the differences in submodules in different ways:

 - just show the "submodule used to be commit X, now it's at commit Y"
(default, aka "short" format)

 - show the diff of the submodule recursively ("--submodule=diff":
this is the one that basically tries to make the submodule look
integral to the top-level project)

 - show the diff as the log of commits in the submodule
("--submodule=log": this shows the diff as "what changed").

You can see this in the qt5 project that uses submodules extensively.
It's actually very powerful, but it can be *confusing*.

So go to your qt5 source tree, and do this:

   git log -p

and you'll see things like

  commit 28ffb0ce8a2f1a9d2003607114ef131cc71d849e
  Author: Qt Submodule Update Bot <qt_submodule_update_bot at qt-project.org>
  Date:   Mon Sep 25 23:02:12 2017 +0300

    Update submodules on '5.9' in qt5

    Change-Id: I67e1530a32c84d0eb3b9bbb702922a6ae4f20362
    Reviewed-by: Liang Qi <liang.qi at qt.io>

  diff --git a/qt3d b/qt3d
  index ba9a38c..72e8052 160000
  --- a/qt3d
  +++ b/qt3d
  @@ -1 +1 @@
  -Subproject commit ba9a38ceca15f9bc086a6c9c5d341001e9e73852
  +Subproject commit 72e80520d36802672eca1e93bc6c6019e6f5ffc3
  ...

and then compare it with "git log -p --submodule=log", and you'll see
something very convenient as a maintainer:

  commit 28ffb0ce8a2f1a9d2003607114ef131cc71d849e
  Author: Qt Submodule Update Bot <qt_submodule_update_bot at qt-project.org>
  Date:   Mon Sep 25 23:02:12 2017 +0300

    Update submodules on '5.9' in qt5

    Change-Id: I67e1530a32c84d0eb3b9bbb702922a6ae4f20362
    Reviewed-by: Liang Qi <liang.qi at qt.io>

  Submodule qt3d ba9a38cec..72e80520d:
    > Copy size and pixelRatio when switching activeFrameGraph
  Submodule qtbase b0ffb332f2..bd72ead4d1:
    > Fix docs about QMAKESPEC in INCLUDEPATH
    > Windows QPA: Move function to find screen by HWND to QWindowsScreenManager
    > Make QDateTimeParser a separate feature
    > doc: fix code snippet of qConstOverload usage
    > moc: don't use const_cast in qt_metacast generated code
    > Fix namespaced build on macOS
    > Fix typo in debug statement: QPlatformScren -> QPlatformScreen
    > Fix parsing of tzfile(5) files in QTimeZonePrivate
  ...

but maybe you wanted the actual recursive diff: "git log -p
--submodule=diff" (and then you see what the diffs were in all the
subprojects when the top-level project updated from one version of a
subproject to another).
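
For anyone without a qt5 tree at hand, the three "--submodule"
formats can also be compared in a small sandbox. This is only a
sketch: "lib", "app" and the file contents are invented for the demo,
and it uses "git show" on a single commit rather than "git log -p":

```shell
#!/bin/sh
# Sketch only: "lib", "app" and hello.txt are hypothetical demo names.
set -e
demo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q "$demo/lib"
echo "hello" > "$demo/lib/hello.txt"
git -C "$demo/lib" add hello.txt
git -C "$demo/lib" commit -q -m "lib: add greeting"

git init -q "$demo/app"
cd "$demo/app"
# protocol.file.allow is only needed because the demo clones a local path.
git -c protocol.file.allow=always submodule --quiet add "$demo/lib" lib
git commit -q -m "app: add lib submodule"

# Change the submodule, then record the new commit in the top-level project.
echo "goodbye" > lib/hello.txt
git -C lib commit -q -am "lib: change greeting"
git add lib
git commit -q -m "app: bump lib"

# The same top-level commit, shown three ways (--format= hides the header).
as_short=$(git show --format= --submodule=short HEAD)
as_log=$(git show --format= --submodule=log HEAD)
as_diff=$(git show --format= --submodule=diff HEAD)

printf '%s\n\n%s\n\n%s\n' "$as_short" "$as_log" "$as_diff"
```

The first form shows only the "Subproject commit" lines, the second
lists the submodule commit subjects, and the third includes the
actual change to lib/hello.txt.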

See? It's just more complex than I think people expect. It's
_powerful_, and if you have the right expectations you'll love it, but
it does need those right expectations.

               Linus

