Almost Beta 2 Cloud Storage fails

Rick Walsh rickmwalsh at gmail.com
Mon Sep 21 13:30:58 PDT 2015


On 22 September 2015 at 03:24, Dirk Hohndel <dirk at hohndel.org> wrote:

> On Tue, Sep 22, 2015 at 01:17:06AM +1000, Rick Walsh wrote:
> > > > Cloud storage: checking connection to cloud server
> > > > git storage: fetch remote
> > > > git storage: check remote status
> > > > git storage: try to update
> > > > git storage: update remote
> > > >
> > > > Then Subsurface freezes.
> > >
> > > So it looks like it might be the git_remote_push() that hangs.
> > > Which is odd, given that the fetch above seems to have worked.
> > > I guess I need to add timestamps... how long did this take?
> > > I mean how long from start to the "update remote" and hang?
> > > A second? Five? Thirty?
> >
> > Something like a second. Maybe more but less than 5.
>
> So it's not timing out. Good.
> >
> > > Can you run a traceroute to cloud.subsurface-divelog.org ?
> >
> > Sleep now but I can tomorrow.
>
> No problem. Since it's only taking a second or two it's not a connectivity
> issue.
>
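
The timestamps Dirk mentions wanting on the "git storage:" messages could look roughly like the sketch below. It is only an illustration: `elapsed_seconds`, `format_debug_line` and `debug_log` are hypothetical names, not functions from Subsurface, and the output format is an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() under strict -std= modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Seconds elapsed between two CLOCK_MONOTONIC readings. */
double elapsed_seconds(const struct timespec *start, const struct timespec *now)
{
    return (double)(now->tv_sec - start->tv_sec) +
           (double)(now->tv_nsec - start->tv_nsec) / 1e9;
}

/* Hypothetical helper: writes "[   1.234s] msg" into buf.
 * Returns whatever snprintf() returns (the untruncated length). */
int format_debug_line(char *buf, size_t len, double elapsed, const char *msg)
{
    assert(buf && msg && len > 0);
    return snprintf(buf, len, "[%8.3fs] %s", elapsed, msg);
}

/* Hypothetical helper: prints msg to stderr, prefixed with the time
 * that has passed since *start was recorded. */
void debug_log(const struct timespec *start, const char *msg)
{
    struct timespec now;
    char line[256];

    clock_gettime(CLOCK_MONOTONIC, &now);
    format_debug_line(line, sizeof(line), elapsed_seconds(start, &now), msg);
    fprintf(stderr, "%s\n", line);
}
```

Recording a start time once with `clock_gettime(CLOCK_MONOTONIC, &start)` and calling `debug_log(&start, "git storage: update remote")` before each step would show how long each stage takes, without relying on the wall clock.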

FWIW, here's the traceroute.

traceroute to cloud.subsurface-divelog.org (52.25.223.173), 30 hops max, 60 byte packets
 1  gateway (192.168.1.1)  1.045 ms  1.680 ms  1.775 ms
 2  lo0.bng1.mel4.on.ii.net (150.101.32.44)  14.045 ms  15.004 ms  16.828 ms
 3  ae4.cr1.mel4.on.ii.net (150.101.33.106)  17.808 ms  19.183 ms  20.463 ms
 4  ae2.br1.syd7.on.ii.net (150.101.33.28)  34.200 ms  50.582 ms  50.852 ms
 5  po-0-7-2-0.br1.nrt1.on.ii.net (150.101.33.201)  436.036 ms  436.024 ms  436.015 ms
 6  xe-0-0-0-2.r00.tokyjp03.jp.bb.gin.ntt.net (61.120.146.177)  436.005 ms  404.624 ms  375.433 ms
 7  ae-0.amazon.tokyjp03.jp.bb.gin.ntt.net (61.213.145.2)  375.400 ms  375.388 ms  375.379 ms
 8  27.0.0.228 (27.0.0.228)  375.371 ms  375.359 ms  375.348 ms
 9  * * *
10  205.251.232.74 (205.251.232.74)  351.057 ms 54.239.52.134 (54.239.52.134)  217.883 ms *
11  205.251.232.203 (205.251.232.203)  214.407 ms  208.944 ms  209.916 ms
12  205.251.232.61 (205.251.232.61)  234.650 ms 54.239.48.183 (54.239.48.183)  212.854 ms  214.187 ms
13  205.251.232.74 (205.251.232.74)  224.804 ms * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

I've never used traceroute before, so I don't know if that's what it's
supposed to do, but I don't get the * * * entries if, for example, I
traceroute www.subsurface-divelog.org instead:

traceroute to www.subsurface-divelog.org (198.145.64.136), 30 hops max, 60 byte packets
 1  gateway (192.168.1.1)  1.056 ms  1.370 ms  1.530 ms
 2  lo0.bng1.mel4.on.ii.net (150.101.32.44)  13.901 ms  14.846 ms  16.215 ms
 3  ae4.cr1.mel4.on.ii.net (150.101.33.106)  20.502 ms  20.149 ms  21.290 ms
 4  ae2.br1.syd7.on.ii.net (150.101.33.28)  35.644 ms  36.782 ms  37.808 ms
 5  te0-2-1-2.br1.sjc2.on.ii.net (150.101.33.147)  197.029 ms  197.100 ms te-0-2-1-3.br1.sjc2.on.ii.net (150.101.33.251)  184.741 ms
 6  snjs.equinix.twtelecom.net (206.223.116.36)  208.939 ms  199.477 ms  199.446 ms
 7  pdx1-ar3-xe-0-0-0-0.us.twtelecom.net (66.192.240.182)  192.158 ms  197.805 ms  194.834 ms
 8  66-193-100-90.static.twtelecom.net (66.193.100.90)  196.035 ms  197.627 ms  199.128 ms
 9  ge-8-2-20.acs-rtr05.ptldor02.iinet.com (198.145.240.166)  310.944 ms  288.200 ms  288.095 ms
10  mail.gr8dns.org (198.145.64.136)  193.716 ms  186.373 ms  187.093 ms


On my old computer, I get similar output for traceroute to both
cloud.subsurface-divelog.org and www.subsurface-divelog.org.


>
> > > What are your ping times?
>

[rick at notyourcomputer ~]$ ping cloud.subsurface-divelog.org -D
PING cloud.subsurface-divelog.org (52.25.223.173) 56(84) bytes of data.
^C
--- cloud.subsurface-divelog.org ping statistics ---
138 packets transmitted, 0 received, 100% packet loss, time 136999ms
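
Since ping reports 100% loss here even though the web view over https loads in about a second, ICMP replies don't seem to be a reliable signal for this host. A check that goes over TCP to the https port may say more. The following is a minimal sketch of such a probe; `tcp_reachable` is a hypothetical helper (not part of Subsurface or libgit2), and the non-blocking connect with a `select()` timeout is just one common way to do it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for getaddrinfo() under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a TCP connection to host:port is
 * established within timeout_ms (per resolved address), 0 otherwise. */
int tcp_reachable(const char *host, const char *port, int timeout_ms)
{
    struct addrinfo hints, *res, *ai;
    int ok = 0;

    assert(host && port && timeout_ms >= 0);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return 0;

    for (ai = res; ai && !ok; ai = ai->ai_next) {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (fd >= FD_SETSIZE) {       /* select() cannot watch this fd */
            close(fd);
            continue;
        }
        /* Non-blocking, so connect() returns immediately and we can
         * bound the wait ourselves. */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) {
            ok = 1;
        } else if (errno == EINPROGRESS) {
            fd_set wfds;
            struct timeval tv;
            int err = 0;
            socklen_t len = sizeof(err);

            FD_ZERO(&wfds);
            FD_SET(fd, &wfds);
            tv.tv_sec = timeout_ms / 1000;
            tv.tv_usec = (timeout_ms % 1000) * 1000;
            /* Writable + SO_ERROR == 0 means the handshake completed. */
            if (select(fd + 1, NULL, &wfds, NULL, &tv) == 1 &&
                getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0 &&
                err == 0)
                ok = 1;
        }
        close(fd);
    }
    freeaddrinfo(res);
    return ok;
}
```

Calling `tcp_reachable("cloud.subsurface-divelog.org", "443", 5000)` would then report whether the https port answers within five seconds, independent of whether ping or traceroute probes get through.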



> > > If you access the server over https with a browser, is it responsive?
> > >
> > The webview takes about a second to load/refresh
>
> My guess at this point is that somehow your local cache is messed up. That
> still really doesn't explain why we hang there, though.
>
> How good are you with a debugger? The best way to deal with this would be
> to set a break point on the call to git_remote_push() in git-access.c
> and then step through the next few instructions to see where we get stuck.
>
> If this isn't something you are comfortable with I can add more debug
> printouts that will tell us if it really is git_remote_push() or possibly
> something else that's causing the problem.
>
>
Does this help?

Breakpoint 1, update_remote (repo=0x21fca40, origin=0x23a18f0, local=0x26fbe20, remote=0x21e7ea0, rt=RT_HTTPS)
    at /home/rick/src/subsurface/git-access.c:244
244             if (git_remote_push(origin, &refspec, &opts)) {
(gdb) step
[Thread 0x7fff797b0700 (LWP 3640) exited]

(hangs)

^C
Program received signal SIGINT, Interrupt.
0x00007fffefaee063 in select () from /lib64/libc.so.6
(gdb) bt 20
#0  0x00007ffff630938c in wait_for.constprop () from /home/rick/src/install-root/lib/libgit2.so.23
#1  0x00007ffff63093f4 in curls_write () from /home/rick/src/install-root/lib/libgit2.so.23
#2  0x00007ffff6326a73 in bio_write () from /home/rick/src/install-root/lib/libgit2.so.23
#3  0x00007ffff5d109dc in BIO_write () from /lib64/libcrypto.so.10
#4  0x00007ffff604c152 in ssl3_write_pending () from /lib64/libssl.so.10
#5  0x00007ffff604c814 in ssl3_write_bytes () from /lib64/libssl.so.10
#6  0x00007ffff6326c5d in openssl_write () from /home/rick/src/install-root/lib/libgit2.so.23
#7  0x00007ffff63411ce in write_chunk () from /home/rick/src/install-root/lib/libgit2.so.23
#8  0x00007ffff634248f in http_stream_write_chunked () from /home/rick/src/install-root/lib/libgit2.so.23
#9  0x00007ffff6345c5d in stream_thunk () from /home/rick/src/install-root/lib/libgit2.so.23
#10 0x00007ffff62c7995 in write_object () from /home/rick/src/install-root/lib/libgit2.so.23
#11 0x00007ffff62cb927 in git_packbuilder_foreach () from /home/rick/src/install-root/lib/libgit2.so.23
#12 0x00007ffff63473ce in git_smart.push () from /home/rick/src/install-root/lib/libgit2.so.23
#13 0x00007ffff62de72a in git_push_finish () from /home/rick/src/install-root/lib/libgit2.so.23
#14 0x00007ffff63315d0 in git_remote_upload () from /home/rick/src/install-root/lib/libgit2.so.23
#15 0x00007ffff6331656 in git_remote_push () from /home/rick/src/install-root/lib/libgit2.so.23
#16 0x00000000006aa892 in update_remote (repo=0x21fca40, origin=0x23a18f0, local=0x26fbe20, remote=0x21e7ea0, rt=RT_HTTPS)
    at /home/rick/src/subsurface/git-access.c:244
#17 0x00000000006ab229 in try_to_update (repo=0x21fca40, origin=0x23a18f0, local=0x26fbe20, remote=0x21e7ea0, rt=RT_HTTPS)
    at /home/rick/src/subsurface/git-access.c:414
#18 0x00000000006ab492 in check_remote_status (repo=0x21fca40, origin=0x23a18f0, branch=0x2335170 "rickmwalsh at gmail.com", rt=RT_HTTPS)
    at /home/rick/src/subsurface/git-access.c:471
#19 0x00000000006ab735 in sync_with_remote (repo=0x21fca40, remote=0x2334cd0 "https://cloud.subsurface-divelog.org//git/rickmwalsh@gmail.com",
    branch=0x2335170 "rickmwalsh at gmail.com", rt=RT_HTTPS) at /home/rick/src/subsurface/git-access.c:532
(More stack frames follow...)
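
The interrupt landed in select(), and the innermost libgit2 frames are wait_for called from curls_write, so the push looks like it is sitting in a wait for the socket during a write. For comparison, here is a minimal sketch of what a bounded wait-then-write looks like in general. It is not libgit2's code, and I am not claiming to know how libgit2 implements its wait; `wait_writable` and `write_with_timeout` are hypothetical names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd becomes writable within
 * timeout_ms, 0 if the timeout expires, -1 if select() fails. */
int wait_writable(int fd, int timeout_ms)
{
    assert(fd >= 0 && fd < FD_SETSIZE && timeout_ms >= 0);
    for (;;) {
        fd_set wfds;
        struct timeval tv;
        int rc;

        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        rc = select(fd + 1, NULL, &wfds, NULL, &tv);
        if (rc > 0)
            return 1;
        if (rc == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        /* Interrupted by a signal: retry. Simplification: the retry
         * restarts with the full timeout rather than the remainder. */
    }
}

/* Hypothetical helper: like write(), but gives up if fd does not become
 * writable in time. Returns bytes written, 0 on timeout, -1 on error. */
ssize_t write_with_timeout(int fd, const void *buf, size_t len, int timeout_ms)
{
    int ready = wait_writable(fd, timeout_ms);

    if (ready <= 0)
        return ready;
    return write(fd, buf, len);
}
```

With a wait like this, a peer that stops draining the connection produces a timeout the caller can report, rather than an indefinite hang in select().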



> > > > I tried again, this time making a small change:
> > > >
> > > > [rick at notyourcomputer build]$ ./subsurface -v
> > > > Map theme file does not exist: ""
> > > > QInotifyFileSystemWatcherEngine::addPaths: inotify_add_watch failed: No such file or directory
> > > > git storage: update local repo
> > > > sync with remote https://cloud.subsurface-divelog.org//git/rickmwalsh@gmail.com[rickmwalsh@gmail.com]
> > > > Cloud storage: checking connection to cloud server
> > > > Cloud storage: unable to connect to cloud server
> > >
> > > Now this is interesting. Because above this worked. But it does have a
> > > relatively short timeout. I'm wondering... if that's the issue. A really
> > > slow connection that times out sometimes and doesn't in other instances.
> >
>

I've repeated this a few times. If I open my log and save to cloud straight
away, it hangs. If I make a change before saving to cloud, it updates the
local cache and then shows the "Could not update Subsurface cloud storage,
try again later" message.


> > If the connection were too slow, it shouldn't work on my old laptop. Cloud
> > save works every time on the old and never on the new. I even plugged my
> > new laptop directly into the router in case it was the wifi.
>
> Yes, as I said above. I no longer think it's the connection.
> I wonder why it failed the second time around but that could just have
> been a fluke.
>
> > > > git storage: do git save
> > >
> > > Now it's only saving to your local cache because it wasn't able to reach
> > > the remote.
> > >
> > > > removed reference to non-existant dive site with uuid ebc33231
> > > > removed reference to non-existant dive site with uuid 9d62ab69
> > > > removed reference to non-existant dive site with uuid fb3df0ad
> > > > removed reference to non-existant dive site with uuid 9d62ab69
> > > > removed reference to non-existant dive site with uuid 13f22db5
> > > > ---many more lines like the above----
> > >
> > > I wonder why that happens. Is this the first time you've successfully
> > > saved in a while?
> > >
> >
> > Yes, the version on the cloud was saved from my old computer.
>
> OK. That makes more sense.
>
> Two possible things to try
>
> a) blow away the local cache (it's in
> ~/.local/shared/Subsurface/Subsurface/<sha of the remote>)
>    and try again. While that shouldn't cause things to just freeze, it is a
>    slight possibility
>

I tried that.  Here's the gdb session.

Using host libthread_db library "/lib64/libthread_db.so.1".
Map theme file does not exist: ""
QInotifyFileSystemWatcherEngine::addPaths: inotify_add_watch failed: No such file or directory
[New Thread 0x7fff78e1e700 (LWP 5063)]
git storage: create_local_repo
Cloud storage: checking connection to cloud server
Cloud storage: unable to connect to cloud server
[Thread 0x7fff78e1e700 (LWP 5063) exited]

Error message at base of screen: Unable to open git repository
'https://cloud.subsurface-divelog.org//git/rickmwalsh@gmail.com[rickmwalsh@gmail.com]'
(Contacting cloud service progress bar shows no progress.  Hit cancel)

^C[New Thread 0x7fff7961f700 (LWP 5358)]
[New Thread 0x7fff7bfff700 (LWP 5348)]
[New Thread 0x7fff80f68700 (LWP 5347)]
[New Thread 0x7fff81769700 (LWP 5346)]
[New Thread 0x7fff81f6a700 (LWP 5345)]
[New Thread 0x7fffc2b7c700 (LWP 5344)]
[New Thread 0x7fffc9ef9700 (LWP 5343)]
[New Thread 0x7fffd7a68700 (LWP 5342)]

Program received signal SIGINT, Interrupt.
0x00007fffefaec2fd in poll () at ../sysdeps/unix/syscall-template.S:81
81      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
Missing separate debuginfos, use: dnf debuginfo-install
bzip2-libs-1.0.6-14.fc22.x86_64 cyrus-sasl-lib-2.1.26-22.fc22.x86_64
cyrus-sasl-lib
----snip---
(gdb) bt 20
#0  0x00007fffefaec2fd in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fffea6b4dbc in g_main_context_iterate.isra () from /lib64/libglib-2.0.so.0
#2  0x00007fffea6b4ecc in g_main_context_iteration () from /lib64/libglib-2.0.so.0
#3  0x00007ffff0627d8f in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib64/libQt5Core.so.5
#4  0x00007ffff05cedaa in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib64/libQt5Core.so.5
#5  0x00007ffff05d6e6c in QCoreApplication::exec() () from /lib64/libQt5Core.so.5
#6  0x00000000004e9b50 in run_ui () at /home/rick/src/subsurface/qt-gui.cpp:72
#7  0x00000000004e8ada in main (argc=2, argv=0x7fffffffded8) at /home/rick/src/subsurface/main.cpp:78

> b) continue with the debugging suggested above
>

Does any of the above give a clue what's causing the problem?

Cheers,

Rick