Almost Beta 2 Cloud Storage fails

Dirk Hohndel dirk at hohndel.org
Mon Sep 21 13:48:21 PDT 2015


Sorry, on my phone. You clearly do have connectivity issues with the cloud server, which is actually an AWS instance.

Hmmm 

That's annoying. I went with a cloud server because I assumed that would cause the fewest connectivity issues. Clearly wrong.

/D

-- 
Sent from my phone

> On Sep 21, 2015, at 13:30, Rick Walsh <rickmwalsh at gmail.com> wrote:
> 
> 
> 
>> On 22 September 2015 at 03:24, Dirk Hohndel <dirk at hohndel.org> wrote:
>> On Tue, Sep 22, 2015 at 01:17:06AM +1000, Rick Walsh wrote:
>> > > > Cloud storage: checking connection to cloud server
>> > > > git storage: fetch remote
>> > > > git storage: check remote status
>> > > > git storage: try to update
>> > > > git storage: update remote
>> > > >
>> > > > Then Subsurface freezes.
>> > >
>> > > So it looks like it might be the git_remote_push() that hangs.
>> > > Which is odd, given that the fetch above seems to have worked.
>> > > I guess I need to add timestamps... how long did this take?
>> > > I mean how long from start to the "update remote" and hang?
>> > > A second? Five? Thirty?
>> >
>> > Something like a second. Maybe more but less than 5.
>> 
>> So it's not timing out. Good.
>> >
>> > > Can you run a traceroute to cloud.subsurface-divelog.org ?
>> >
>> > Sleep now but I can tomorrow.
>> 
>> No problem. Since it's only taking a second or two it's not a connectivity
>> issue.
> 
> FWIW, here's the traceroute.
> 
> traceroute to cloud.subsurface-divelog.org (52.25.223.173), 30 hops max, 60 byte packets
>  1  gateway (192.168.1.1)  1.045 ms  1.680 ms  1.775 ms
>  2  lo0.bng1.mel4.on.ii.net (150.101.32.44)  14.045 ms  15.004 ms  16.828 ms
>  3  ae4.cr1.mel4.on.ii.net (150.101.33.106)  17.808 ms  19.183 ms  20.463 ms
>  4  ae2.br1.syd7.on.ii.net (150.101.33.28)  34.200 ms  50.582 ms  50.852 ms
>  5  po-0-7-2-0.br1.nrt1.on.ii.net (150.101.33.201)  436.036 ms  436.024 ms  436.015 ms
>  6  xe-0-0-0-2.r00.tokyjp03.jp.bb.gin.ntt.net (61.120.146.177)  436.005 ms  404.624 ms  375.433 ms
>  7  ae-0.amazon.tokyjp03.jp.bb.gin.ntt.net (61.213.145.2)  375.400 ms  375.388 ms  375.379 ms
>  8  27.0.0.228 (27.0.0.228)  375.371 ms  375.359 ms  375.348 ms
>  9  * * *
> 10  205.251.232.74 (205.251.232.74)  351.057 ms 54.239.52.134 (54.239.52.134)  217.883 ms *
> 11  205.251.232.203 (205.251.232.203)  214.407 ms  208.944 ms  209.916 ms
> 12  205.251.232.61 (205.251.232.61)  234.650 ms 54.239.48.183 (54.239.48.183)  212.854 ms  214.187 ms
> 13  205.251.232.74 (205.251.232.74)  224.804 ms * *
> 14  * * *
> 15  * * *
> 16  * * *
> 17  * * *
> 18  * * *
> 19  * * *
> 20  * * *
> 21  * * *
> 22  * * *
> 23  * * *
> 24  * * *
> 25  * * *
> 26  * * *
> 27  * * *
> 28  * * *
> 29  * * *
> 30  * * *
> 
> I've never used traceroute before, so don't know if that's what it's supposed to do, but I don't get the *** entries if for example I traceroute www.subsurface-divelog.org
> 
> traceroute to www.subsurface-divelog.org (198.145.64.136), 30 hops max, 60 byte packets
>  1  gateway (192.168.1.1)  1.056 ms  1.370 ms  1.530 ms
>  2  lo0.bng1.mel4.on.ii.net (150.101.32.44)  13.901 ms  14.846 ms  16.215 ms
>  3  ae4.cr1.mel4.on.ii.net (150.101.33.106)  20.502 ms  20.149 ms  21.290 ms
>  4  ae2.br1.syd7.on.ii.net (150.101.33.28)  35.644 ms  36.782 ms  37.808 ms
>  5  te0-2-1-2.br1.sjc2.on.ii.net (150.101.33.147)  197.029 ms  197.100 ms te-0-2-1-3.br1.sjc2.on.ii.net (150.101.33.251)  184.741 ms
>  6  snjs.equinix.twtelecom.net (206.223.116.36)  208.939 ms  199.477 ms  199.446 ms
>  7  pdx1-ar3-xe-0-0-0-0.us.twtelecom.net (66.192.240.182)  192.158 ms  197.805 ms  194.834 ms
>  8  66-193-100-90.static.twtelecom.net (66.193.100.90)  196.035 ms  197.627 ms  199.128 ms
>  9  ge-8-2-20.acs-rtr05.ptldor02.iinet.com (198.145.240.166)  310.944 ms  288.200 ms  288.095 ms
> 10  mail.gr8dns.org (198.145.64.136)  193.716 ms  186.373 ms  187.093 ms
> 
> 
> On my old computer, I get similar output for traceroute to both cloud.subsurface-divelog.org and www.subsurface-divelog.org.
>  
>> 
>> > > What are your ping times?
> 
> [rick at notyourcomputer ~]$ ping cloud.subsurface-divelog.org -D
> PING cloud.subsurface-divelog.org (52.25.223.173) 56(84) bytes of data.
> ^C
> --- cloud.subsurface-divelog.org ping statistics ---
> 138 packets transmitted, 0 received, 100% packet loss, time 136999ms
> 
>  
>> > > If you access the server over https with a browser, is it responsive?
>> > >
>> > The webview takes about a second to load/refresh
>> 
>> My guess at this point is that somehow your local cache is messed up. That
>> still really doesn't explain why we hang there, though.
>> 
>> How good are you with a debugger? The best way to deal with this would be
>> to set a break point on the call to git_remote_push() in git-access.c
>> and then step through the next few instructions to see where we get stuck.
>> 
>> If this isn't something you are comfortable with I can add more debug
>> printouts that will tell us if it really is git_remote_push() or possibly
>> something else that's causing the problem.
> 
> Does this help?
> 
> Breakpoint 1, update_remote (repo=0x21fca40, origin=0x23a18f0, local=0x26fbe20, remote=0x21e7ea0, rt=RT_HTTPS) at /home/rick/src/subsurface/git-access.c:244
> 244             if (git_remote_push(origin, &refspec, &opts)) {
> (gdb) step
> [Thread 0x7fff797b0700 (LWP 3640) exited]
> 
> (hangs)
> 
> ^C
> Program received signal SIGINT, Interrupt.
> 0x00007fffefaee063 in select () from /lib64/libc.so.6
> (gdb) bt 20
> #0  0x00007ffff630938c in wait_for.constprop () from /home/rick/src/install-root/lib/libgit2.so.23
> #1  0x00007ffff63093f4 in curls_write () from /home/rick/src/install-root/lib/libgit2.so.23
> #2  0x00007ffff6326a73 in bio_write () from /home/rick/src/install-root/lib/libgit2.so.23
> #3  0x00007ffff5d109dc in BIO_write () from /lib64/libcrypto.so.10
> #4  0x00007ffff604c152 in ssl3_write_pending () from /lib64/libssl.so.10
> #5  0x00007ffff604c814 in ssl3_write_bytes () from /lib64/libssl.so.10
> #6  0x00007ffff6326c5d in openssl_write () from /home/rick/src/install-root/lib/libgit2.so.23
> #7  0x00007ffff63411ce in write_chunk () from /home/rick/src/install-root/lib/libgit2.so.23
> #8  0x00007ffff634248f in http_stream_write_chunked () from /home/rick/src/install-root/lib/libgit2.so.23
> #9  0x00007ffff6345c5d in stream_thunk () from /home/rick/src/install-root/lib/libgit2.so.23
> #10 0x00007ffff62c7995 in write_object () from /home/rick/src/install-root/lib/libgit2.so.23
> #11 0x00007ffff62cb927 in git_packbuilder_foreach () from /home/rick/src/install-root/lib/libgit2.so.23
> #12 0x00007ffff63473ce in git_smart.push () from /home/rick/src/install-root/lib/libgit2.so.23
> #13 0x00007ffff62de72a in git_push_finish () from /home/rick/src/install-root/lib/libgit2.so.23
> #14 0x00007ffff63315d0 in git_remote_upload () from /home/rick/src/install-root/lib/libgit2.so.23
> #15 0x00007ffff6331656 in git_remote_push () from /home/rick/src/install-root/lib/libgit2.so.23
> #16 0x00000000006aa892 in update_remote (repo=0x21fca40, origin=0x23a18f0, local=0x26fbe20, remote=0x21e7ea0, rt=RT_HTTPS)
>     at /home/rick/src/subsurface/git-access.c:244
> #17 0x00000000006ab229 in try_to_update (repo=0x21fca40, origin=0x23a18f0, local=0x26fbe20, remote=0x21e7ea0, rt=RT_HTTPS)
>     at /home/rick/src/subsurface/git-access.c:414
> #18 0x00000000006ab492 in check_remote_status (repo=0x21fca40, origin=0x23a18f0, branch=0x2335170 "rickmwalsh at gmail.com", rt=RT_HTTPS)
>     at /home/rick/src/subsurface/git-access.c:471
> #19 0x00000000006ab735 in sync_with_remote (repo=0x21fca40, remote=0x2334cd0 "https://cloud.subsurface-divelog.org//git/rickmwalsh@gmail.com", 
>     branch=0x2335170 "rickmwalsh at gmail.com", rt=RT_HTTPS) at /home/rick/src/subsurface/git-access.c:532
> (More stack frames follow...)
> 
>  
>> > > > I tried again, this time making a small change:
>> > > >
>> > > > [rick at notyourcomputer build]$ ./subsurface -v
>> > > > Map theme file does not exist: ""
>> > > > QInotifyFileSystemWatcherEngine::addPaths: inotify_add_watch failed: No
>> > > > such file or directory
>> > > > git storage: update local repo
>> > > > sync with remote
>> > > >
>> > https://cloud.subsurface-divelog.org//git/rickmwalsh@gmail.com[rickmwalsh@gmail.com]
>> > > > Cloud storage: checking connection to cloud server
>> > > > Cloud storage: unable to connect to cloud server
>> > >
>> > > Now this is interesting. Because above this worked. But it does have a
>> > > relatively short timeout. I'm wondering... if that's the issue. A really
>> > > slow connection that times out sometimes and doesn't in other instances.
>> >
> 
> I've repeated this a few times. If I open my log and save to cloud, it hangs; if I make a change before saving to cloud, it updates the local cache and then shows the "Could not update Subsurface cloud storage, try again later" message.
>  
>> > If the connection were too slow, it shouldn't work on my old laptop. Cloud
>> > save works every time on the old and never on the new. I even plugged my
>> > new laptop directly into the router in case it was the wifi.
>> 
>> Yes, as I said above. I no longer think it's the connection.
>> I wonder why it failed the second time around but that could just have
>> been a fluke.
>> 
>> > > > git storage: do git save
>> > >
>> > > Now it's only saving to your local cache because it wasn't able to reach
>> > > the remote.
>> > >
>> > > > removed reference to non-existant dive site with uuid ebc33231
>> > > > removed reference to non-existant dive site with uuid 9d62ab69
>> > > > removed reference to non-existant dive site with uuid fb3df0ad
>> > > > removed reference to non-existant dive site with uuid 9d62ab69
>> > > > removed reference to non-existant dive site with uuid 13f22db5
>> > > > ---many more lines like the above----
>> > >
>> > > I wonder why that happens. Is this the first time you've successfully
>> > > saved in a while?
>> > >
>> >
>> > Yes, the version on the cloud was saved from my old computer.
>> 
>> OK. That makes more sense.
>> 
>> Two possible things to try
>> 
>> a) blow away the local cache (it's in ~/.local/share/Subsurface/Subsurface/<sha of the remote>)
>>    and try again. While that shouldn't cause things to just freeze, that is a
>>    slight possibility
> 
> I tried that.  Here's the gdb session.
> 
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Map theme file does not exist: ""
> QInotifyFileSystemWatcherEngine::addPaths: inotify_add_watch failed: No such file or directory
> [New Thread 0x7fff78e1e700 (LWP 5063)]
> git storage: create_local_repo
> Cloud storage: checking connection to cloud server
> Cloud storage: unable to connect to cloud server
> [Thread 0x7fff78e1e700 (LWP 5063) exited]
> 
> Error message at base of screen: Unable to open git repository 'https://cloud.subsurface-divelog.org//git/rickmwalsh@gmail.com[rickmwalsh@gmail.com]'
> (Contacting cloud service progress bar shows no progress.  Hit cancel)
> 
> ^C[New Thread 0x7fff7961f700 (LWP 5358)]
> [New Thread 0x7fff7bfff700 (LWP 5348)]
> [New Thread 0x7fff80f68700 (LWP 5347)]
> [New Thread 0x7fff81769700 (LWP 5346)]
> [New Thread 0x7fff81f6a700 (LWP 5345)]
> [New Thread 0x7fffc2b7c700 (LWP 5344)]
> [New Thread 0x7fffc9ef9700 (LWP 5343)]
> [New Thread 0x7fffd7a68700 (LWP 5342)]
> 
> Program received signal SIGINT, Interrupt.
> 0x00007fffefaec2fd in poll () at ../sysdeps/unix/syscall-template.S:81
> 81      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
> Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.6-14.fc22.x86_64 cyrus-sasl-lib-2.1.26-22.fc22.x86_64 cyrus-sasl-lib
> ----snip---
> (gdb) bt 20
> #0  0x00007fffefaec2fd in poll () at ../sysdeps/unix/syscall-template.S:81
> #1  0x00007fffea6b4dbc in g_main_context_iterate.isra () from /lib64/libglib-2.0.so.0
> #2  0x00007fffea6b4ecc in g_main_context_iteration () from /lib64/libglib-2.0.so.0
> #3  0x00007ffff0627d8f in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib64/libQt5Core.so.5
> #4  0x00007ffff05cedaa in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib64/libQt5Core.so.5
> #5  0x00007ffff05d6e6c in QCoreApplication::exec() () from /lib64/libQt5Core.so.5
> #6  0x00000000004e9b50 in run_ui () at /home/rick/src/subsurface/qt-gui.cpp:72
> #7  0x00000000004e8ada in main (argc=2, argv=0x7fffffffded8) at /home/rick/src/subsurface/main.cpp:78
> 
>> b) continue with the debugging suggested above
> 
> Does any of the above give a clue what's causing the problem?
> 
> Cheers,
> 
> Rick
>  
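
A side note on the ping and traceroute results above: ping sends ICMP echo requests and traceroute on Linux sends UDP probes by default, and either can be dropped by a firewall even when the HTTPS port that Subsurface uses is reachable. A more direct test is to time a plain TCP connect to port 443. The following is only a rough sketch and not what Subsurface itself does; the function name tcp_reachable and the 5 second timeout are made up for illustration.

```python
import socket
import time


def tcp_reachable(host, port=443, timeout=5.0):
    """Try a plain TCP connect and report (ok, seconds_elapsed, error_text).

    Hypothetical helper for illustration; the timeout is an arbitrary choice.
    """
    start = time.monotonic()
    try:
        # create_connection resolves the name and tries each address in turn
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, None
    except OSError as exc:
        # covers DNS failures, refused connections and timeouts
        return False, time.monotonic() - start, str(exc)


if __name__ == "__main__":
    host = "cloud.subsurface-divelog.org"  # host name taken from the thread
    ok, elapsed, err = tcp_reachable(host)
    if ok:
        print(f"{host}:443 connected in {elapsed:.2f}s")
    else:
        print(f"{host}:443 failed after {elapsed:.2f}s: {err}")
```

If the connect succeeds quickly on both the old and the new laptop, the missing ping replies probably say more about packet filtering than about the route, and the hang in git_remote_push() would need another explanation.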