Ticket #1163 (closed defect: worksforme)

Opened 4 years ago

Last modified 14 months ago

rtorrent freezes under FreeBSD 6.2-R

Reported by: nitro2k01@gmail.com Owned by: rakshasa
Priority: normal Component: libtorrent
Version: Severity: major
Keywords: Cc: nitro2k01@gmail.com

Description

This bug first occurred in rTorrent 0.6.3/0.10.3 out of nowhere (at the time I had been using that version without problems for months). I recently upgraded to rTorrent 0.7.9/0.11.9 without any improvement. The bug is the following: the GUI of rtorrent freezes completely for several minutes, then becomes responsive for about 30 seconds before it freezes again. I haven't investigated it further, but I think any transfers are stopped while the application is frozen as well. While the application is frozen, the process state is KQREAD and the CPU usage is 0%. While it works normally, it is in the SELECT state and the CPU usage is normal (5-20%). Switching between select and kqueue based polling does not make a difference. The operating system is FreeBSD 6.2-Release running on the Sparc64 platform. The versions of rtorrent/libtorrent I'm using are the ones in the Ports collection. The problem could somehow be related to the trackers waffles.fm and/or what.cd, but I haven't been able to confirm that.

Attachments

rtorrent_strace_sort_uniq-time-gaps-only.txt (10.1 KB) - added by BT <do.it@i.ua> 2 years ago.
strace log of rtorrent freezing, timestamp gaps only (+/- 7 strace.log lines each time gap)

Change History

  Changed 4 years ago by rakshasa

I would suspect that you have problems with DNS lookups blocking on tracker requests. Try either compiling libcurl with async dns lookup support, or figure out what tracker urls cause the problems.
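One way to figure out which tracker URLs cause the problem is to time a blocking lookup for each tracker hostname. The following is only a minimal sketch under stated assumptions: it calls the system resolver through POSIX getaddrinfo() directly, which is not necessarily the code path libcurl takes inside rtorrent, and the names time_lookup and LookupTiming are made up for this illustration.

```cpp
#include <chrono>
#include <string>

#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper type: the return code of one getaddrinfo() call and
// how long the call blocked.
struct LookupTiming {
  int error;                          // 0 on success, otherwise an EAI_* code
  std::chrono::milliseconds elapsed;  // wall-clock time spent in the call
};

// Resolves `host` with the blocking system resolver and measures the time
// the call takes. A hostname that needs many seconds here is a candidate
// for the kind of freeze described in this ticket.
LookupTiming time_lookup(const std::string& host) {
  addrinfo hints{};
  hints.ai_family = AF_UNSPEC;      // accept IPv4 and IPv6 results
  hints.ai_socktype = SOCK_STREAM;

  addrinfo* result = nullptr;
  const auto start = std::chrono::steady_clock::now();
  const int error = getaddrinfo(host.c_str(), nullptr, &hints, &result);
  const auto stop = std::chrono::steady_clock::now();

  if (result != nullptr)
    freeaddrinfo(result);

  return {error,
          std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)};
}
```

Calling time_lookup() once per tracker hostname and comparing the elapsed values should show which names hit the resolver timeout; gai_strerror() turns a non-zero error field into a readable message.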

  Changed 4 years ago by nitro2k01@gmail.com

While testing, I tried disabling UDP trackers, which didn't change anything. I also removed torrents from trackers that no longer exist, which seemed to solve the problem. For now I conclude that DNS problems were the cause. (I'll go through the rest of my torrents for non-existent domain names, though.) Is there no (simple) way to solve this bug without recompiling libcurl? Maybe put DNS lookups in a separate thread? (I'm thinking about ensuring the highest possible quality of the program.) Still consider this a formal bug report, but please change the severity to minor.
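The "separate thread" idea could look roughly like the sketch below. It is not how rtorrent, libtorrent or libcurl implement lookups; resolve_with_timeout is a hypothetical name, and the resolver is passed in as a callable so that any blocking lookup function can be plugged in. The caller waits only for the given timeout, so a hanging lookup no longer blocks the main loop.

```cpp
#include <chrono>
#include <exception>
#include <functional>
#include <future>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Runs `resolve(host)` on a detached worker thread and waits at most
// `timeout` for it. Returns true and stores the resolver's return value in
// `result` if the worker finished in time. Returns false otherwise; the
// worker then keeps running in the background and its result is discarded.
bool resolve_with_timeout(std::function<int(const std::string&)> resolve,
                          std::string host,
                          std::chrono::milliseconds timeout,
                          int& result) {
  // The promise is shared so that it outlives this function if we time out.
  auto promise = std::make_shared<std::promise<int>>();
  std::future<int> future = promise->get_future();

  std::thread([promise, resolve = std::move(resolve), host = std::move(host)] {
    try {
      promise->set_value(resolve(host));
    } catch (...) {
      promise->set_exception(std::current_exception());
    }
  }).detach();

  if (future.wait_for(timeout) != std::future_status::ready)
    return false;

  result = future.get();  // rethrows if the resolver threw
  return true;
}
```

The trade-off of this design is that a timed-out lookup is abandoned rather than cancelled, so a stuck resolver still occupies one thread until the system timeout expires.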

  Changed 4 years ago by carpetsmoker@xs4all.nl

I have the same problem on FreeBSD 7: if rtorrent is unable to look up DNS records (for any reason), it freezes until the lookup times out.

Recompiling curl with async DNS (c-ares) doesn't help...

  Changed 4 years ago by anonymous

I have the same problems as described in the description.

I'm on Mac, OS X Leopard 10.5.2.

Latest devel libtorrent and rtorrent.

  Changed 4 years ago by anonymous

I second this.

I have the same problem on the following system:

[22:07:35][flagel@rose /crypt/5/Audio_Scene] uname -a
FreeBSD rose.qwqb.com 6.2-STABLE FreeBSD 6.2-STABLE #0: Thu Feb 15 15:07:26 CET 2007  flagel@rose.qwqb.com:/usr/obj/usr/src/sys/GENERIC i386

A fix would be very much appreciated.

  Changed 4 years ago by anonymous

I have the same problem on 'FreeBSD <host> 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Mon Feb 25 09:35:41 UTC 2008  root@obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC sparc64'.

rTorrent: 0.7.7 libTorrent: 0.11.7

Perhaps using more threads would resolve this?

  Changed 2 years ago by do.it@i.ua

I believe I could have the same problem, but I'm unsure due to a specific setup.

I'm using rtorrent/cygwin (using 'select') with exactly the same sessions/downloads directories, as rtorrent/debian on the same dual-boot machine (rtorrent/debian runs perfectly, using 'epoll'). I've configured mount points to look exactly the same under the two OSes - /mnt/d/exchange/.torrents/ for sessions, and /mnt/d/exchange/ for downloaded items. For Cygwin, I used instructions from  http://rtwi.jmk.hu/wiki/rTorrentOnWindows and a patch from  http://rtwi.jmk.hu/downloads/misc/rtow.diff. Libtorrent is compiled with --disable-mincore, and I haven't enabled the XMLRPC-C.

Strangely, under Cygwin rtorrent reports "Tracker: [Couldn't resolve hostname]", while a manual nslookup succeeds and there is no such problem in rtorrent/debian.

I've tried both SVN and "stable" versions, but both behaved the same.

While using stable version, I've tried compiling c-ares, then recompiling libcurl and libtorrent - but that hasn't changed anything.

While "freezing", rtorrent CPU use is a straight 100% line, with seemingly non-periodic sharp deviations towards 0% CPU use. Interface is responding very slowly.

After running for a while (5-10 minutes), freezes get shorter, and 0%-CPU periods get longer. I still have to check that, but freezes might be completely gone after a while. At least no freezes between 20 and 40 minutes running rtorrent in the current session.

Currently I only have a 500MiB strace.log file of the rtorrent session, which has "gaps" on the order of dozens of seconds (and maybe minutes, didn't have time to examine it all). I'll post more details (and a gdb backtrace) as they become available.

Any suggestions/hints would be welcome. Does anyone else (e.g. on FreeBSD systems) observe rtorrent "getting back to normal" after some time running? Give it at least 30 minutes to be sure.

  Changed 2 years ago by do.it@i.ua

Amendment: no, freezes do not go away completely - random-length 100% CPU use spikes still occur after running for a while.

  Changed 2 years ago by BT <do.it@i.ua>

I have confirmed that my "Tracker: [Couldn't resolve hostname]" problem is the same as mentioned here: http://libtorrent.rakshasa.no/ticket/1481#comment:5

Will now check if fixing it has any effect on the freezing.

  Changed 2 years ago by BT <do.it@i.ua>

Amendment to my previous comment: the exact message cURL 7.19.6 generates is

$ curl http://bt.somesite.tv/
curl: (6) Could not resolve host: bt.somesite.tv (Could not contact DNS servers)

I haven't yet found a way to fix it.

  Changed 2 years ago by BT <do.it@i.ua>

gdb -p rtorrent.PID during "freeze" gave this:

(gdb) bt
#0  0x7c90120f in ntdll!DbgUiConnectToDbg () from /mnt/c/WINDOWS/system32/ntdll.dll
#1  0x7c951e40 in ntdll!KiIntSystemCall () from /mnt/c/WINDOWS/system32/ntdll.dll
#2  0x00000005 in ?? ()
#3  0x00000004 in ?? ()
#4  0x00000001 in ?? ()
#5  0x1b81ffd0 in ?? ()
#6  0xba340548 in ?? ()
#7  0xffffffff in ?? ()
#8  0x7c90e920 in strchr () from /mnt/c/WINDOWS/system32/ntdll.dll
#9  0x7c951e60 in ntdll!KiIntSystemCall () from /mnt/c/WINDOWS/system32/ntdll.dll
#10 0x00000000 in ?? ()
(gdb) list
138       if (taskScheduler.empty())
139         return c->is_shutdown_started() ? rak::timer::from_milliseconds(100) : rak::timer::from_seconds(60);
140       else if (taskScheduler.top()->time() <= cachedTime)
141         return 0;
142       else
143         return taskScheduler.top()->time() - cachedTime;
144     }
145
146     int
147     main(int argc, char** argv) {

Recompiling with --enable-extra-debug hasn't changed anything - gdb during 'freeze' still shows exactly the same backtrace and code.

Changed 2 years ago by BT <do.it@i.ua>

strace log of rtorrent freezing, timestamp gaps only (+/- 7 strace.log lines each time gap)

  Changed 2 years ago by BT <do.it@i.ua>

Could someone please help interpret the strace.log fragment I've attached? It is only 162 lines. I can publish the complete strace.log somewhere on the web, if needed.

  Changed 2 years ago by BT <do.it@i.ua>

The number of close_wait connections easily reaches a thousand within 25 minutes of rtorrent uptime. For some reason, many connections just don't terminate (I've watched a few to verify that). Could this be related to the freezing?

  Changed 2 years ago by anonymous

I don't know how FreeBSD handles connections in close_wait. But they would belong to peers that don't have proper port forwarding or are firewalled, i.e. those that time out when rtorrent tries to connect to them.

Maybe there's a setting in FreeBSD to time out these connections faster. rtorrent has an internal timeout of 60 seconds and closes the connection if the peer doesn't respond within that time. However, if FreeBSD keeps them around for much longer, they can easily pile up.

Whether that is relevant to this ticket, I cannot say either.

  Changed 2 years ago by do.it@i.ua

How do I force the use of 'select' instead of 'epoll' in rtorrent? (Preferably without recompilation)

That will help rule out the possibility that rtorrent's "select" call is the culprit.

  Changed 2 years ago by anonymous

Replying to do.it@…:

How do I force the use of 'select' instead of 'epoll' in rtorrent? (Preferably without recompilation) That will help rule out the possibility that rtorrent's "select" call is the culprit.

In rtorrent 0.8.5, just set the RTORRENT_POLL=select environment variable.

  Changed 2 years ago by do.it@i.ua

That worked, thanks! rtorrent 0.8.5 using 'select' on Debian doesn't freeze.

Freezing could be due to a bug in cygwin_select (and FreeBSD's select, for that matter). Alternatively, these two select() implementations may have some differences in the number/order of function parameters (I'm guessing here).

At the moment, this doesn't look like a libtorrent/rtorrent bug, but I'll keep posting here if I find something relevant.

So far I've found several relevant reports, but they are for older Cygwin versions (I'm using 1.7b); I haven't searched for FreeBSD select() bug/problem reports:  http://www.cygwin.com/ml/cygwin/2005-05/msg01200.html  http://lists.denx.de/pipermail/u-boot/2008-April/032507.html

The most recent post claims this is fixed in 1.7:  http://old.nabble.com/wget,-ssh,-ssh-agent-hang-in-socket_cleanup-td22144268.html

  Changed 2 years ago by BT <do.it@i.ua>

I have posted this problem to the cygwin mailing list at  http://cygwin.com/ml/cygwin/2009-11/msg00963.html

  Changed 2 years ago by BT <do.it@i.ua>

Here Corinna Vinschen mentions that Cygwin's getopt() descends from OpenBSD. If this is also the case for the rest of the Cygwin codebase, then it proves that the problem is somewhere between rTorrent and *BSD...

  Changed 2 years ago by anonymous

What does getopt have to do with anything? rtorrent doesn't even use it.

  Changed 2 years ago by BT <do.it@i.ua>

It doesn't, I was just pointing out the possible link between rtorrent freezes on FreeBSD (as reported by the OP), and rtorrent freezes on cygwin (as reported by me).

  Changed 2 years ago by BT <do.it@i.ua>

I've "solved" :) freeze problems with rtorrent under cygwin by using this  pre-compiled binary. I don't know what I was doing wrong with my compilation, but this binary doesn't freeze.

  Changed 16 months ago by anonymous

I have observed the same issue with cygwin and rtorrent 0.8.7/0.12.7. I had applied the patch as well. After a bit of debugging, it turned out that the freezing is caused by a change introduced in the patch. Specifically, in the directory.cc file, the following lines were added:

std::string full_path=path+'/';
full_path+=entry->d_name;
    
struct stat sb;
if(stat(full_path.c_str(),&sb))
  continue;

In my case, calls to the stat() function took about 60-120 seconds. I had no intention of investigating this specific issue, so I just commented this code out and used the original code.
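To check whether slow stat() calls are the cause on a particular setup, the per-entry cost can be measured outside rtorrent. The sketch below mirrors the loop from the patch (build the full path, then call stat() on it) but only records how long each call takes. The names find_slow_stat_calls and SlowEntry are hypothetical and not part of libtorrent.

```cpp
#include <chrono>
#include <string>
#include <vector>

#include <dirent.h>
#include <sys/stat.h>

// Hypothetical record for one directory entry whose stat() call was slow.
struct SlowEntry {
  std::string path;
  std::chrono::milliseconds elapsed;
};

// Calls stat() on every entry of `dir` and returns the entries for which
// the call took at least `threshold`. Returns an empty list if the
// directory cannot be opened.
std::vector<SlowEntry> find_slow_stat_calls(const std::string& dir,
                                            std::chrono::milliseconds threshold) {
  std::vector<SlowEntry> slow;

  DIR* handle = opendir(dir.c_str());
  if (handle == nullptr)
    return slow;

  while (dirent* entry = readdir(handle)) {
    std::string full_path = dir + '/';
    full_path += entry->d_name;

    struct stat sb;
    const auto start = std::chrono::steady_clock::now();
    stat(full_path.c_str(), &sb);  // result ignored, only the duration matters
    const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);

    if (elapsed >= threshold)
      slow.push_back({full_path, elapsed});
  }

  closedir(handle);
  return slow;
}
```

Running this against the session or download directory with a threshold of a second or so should list the entries responsible, if stat() is indeed what stalls.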

  Changed 14 months ago by rakshasa

  • status changed from new to closed
  • resolution set to worksforme