Ticket #460 (closed defect: fixed)

Opened 3 years ago

Last modified 11 hours ago

Option to pre allocate files.

Reported by: aagaande@gmail.com Owned by: rakshasa
Priority: normal Milestone:
Component: libtorrent Version:
Severity: normal Keywords:
Cc:

Description

It'd be nice to have an option to preallocate all files (with zero-byte padding, for example) so that files are written as contiguously on disk as possible. This helps a lot in preventing fragmentation.

Attachments

alloc.2.diff (5.9 KB) - added by rakshasa 4 weeks ago.
Try this patch
file_extents.c (2.2 KB) - added by rakshasa 4 weeks ago.
File extents, requires root on linux

Change History

  Changed 3 years ago by Szafran

yup... this would help a lot

  Changed 3 years ago by anonymous

In my opinion it's the most missing feature. Other than that rtorrent has all the features I can think of.

  Changed 3 years ago by anonymous

Ditto. After using rtorrent for a week on a fresh partition "xfs_frag -c frag" just showed a minor fragmentation level of 99% :-(

in reply to: ↑ description   Changed 3 years ago by anonymous

Currently I see that rtorrent does allocate sparse files to "preallocate" the files, but this isn't exactly the same thing. While the sparse files are nice, the true 0-filled preallocated files will help against filesystem fragmentation. I ended up with an ISO image with 28000 fragments on an ext3 partition after downloading(!)
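The difference between a sparse file and a truly allocated one can be checked by comparing the file size with the number of blocks actually backing it. The following is a minimal sketch, not code from rtorrent or libtorrent; the function names are made up for illustration, and it assumes Linux, where `st_blocks` is counted in 512-byte units.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create (or truncate) `path` and grow it to `size` bytes without writing
 * any data. On filesystems with sparse-file support this leaves a hole.
 * Returns an open descriptor, or -1 on error. */
int create_sparse_file(const char *path, off_t size)
{
    assert(size >= 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 1 if fewer blocks are allocated than the file size needs (the
 * file has holes), 0 if it is fully backed by disk blocks, -1 on error.
 * Assumption: st_blocks uses 512-byte units, as on Linux. */
int looks_sparse(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0)
        return -1;
    return (off_t)st.st_blocks * 512 < st.st_size ? 1 : 0;
}
```

A file created this way reports its full size immediately, but the filesystem only picks physical blocks later, as each piece is written, which is where the fragmentation described above comes from.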

  Changed 2 years ago by gentoo_nk

I agree that preallocation with zero-ing (to avoid sparse files) would be a nice feature to have.

I just downloaded a Linux DVD image... Result: The 4.3GB file (4338*1MB chunks) was spread over 30275 fragments! That's about 7 fragments per torrent chunk. (One would expect that at least every chunk is written to the disk at once. That's obviously not the case with libtorrent, and it isn't a bad thing... but extreme fragmentation is the result.)

Sparse files are only useful as long as they remain sparse... which doesn't happen with bittorrent.

  Changed 2 years ago by nisse@loodin.se

Is this something that is planned for any of the releases?

follow-up: ↓ 16   Changed 2 years ago by loswillios

there's a patch available:  http://pastebin.ca/raw/885247

you need a kernel ≥ 2.6.23. make sure to use it only on XFS filesystems

thanks to void

  Changed 2 years ago by anonymous

ecryptfs users would benefit from such an option too, since ecryptfs can't handle sparse files.

See  discussion for further details.

follow-up: ↓ 10   Changed 2 years ago by anonymous

The following  workaround zeroes every file and thus should leave no holes.

It's understood that with this patch, the creation of large files will take longer.

in reply to: ↑ 9 ; follow-up: ↓ 21   Changed 2 years ago by anonymous

Replying to anonymous:

The following  workaround zeroes every file and thus should leave no holes.


That one was a bit `leaky' -- please use  this patch instead.

follow-ups: ↓ 12 ↓ 18   Changed 2 years ago by anonymous

Ultimately it's a job for the filesystem, not for each and every program to do their own thing. ext4 is supposed to have both persistent preallocation and online defragmentation.

in reply to: ↑ 11   Changed 2 years ago by anonymous

Replying to anonymous:

Ultimately it's a job for the filesystem, not for each and every program to do their own thing. ext4 is supposed to have both persistent preallocation and online defragmentation.

Being bug-free is a job for the filesystem too. Nevertheless safe_sync exists.

  Changed 2 years ago by anonymous

safe_sync works around a design flaw in mmap, not a filesystem bug. The problem is that a process, by design, may mmap sparse files even when writing to them would cause the FS to run out of space.

follow-up: ↓ 15   Changed 19 months ago by anonymous

It looks like on FAT on Linux, or on any filesystem on OS X, there is no option to not preallocate. Opening a new torrent causes a surge of disk writes and the available disk space shrinks by the size of the torrent.

in reply to: ↑ 14   Changed 19 months ago by anonymous

Replying to anonymous:

It looks like on FAT on Linux, or on any filesystem on OS X, there is no option to not preallocate.

Of course. That's because FAT doesn't support sparse files at all.

in reply to: ↑ 7   Changed 19 months ago by parafin

Replying to loswillios:

there's a patch available:  http://pastebin.ca/raw/885247 you need a kernel ≥ 2.6.23. make sure to use it only on XFS filesystems thanks to void

To use that patch you need to pass the --with-xfs option to configure. But there is another way: you can apply the same patch and configure libtorrent with --with-posix-fallocate (and without --with-xfs).

For this to work the same as --with-xfs you need a patched glibc (I don't think vanilla glibc has this feature; the Gentoo version >=2.7-r2 is fine, and other distros probably include the fallocate patch too), kernel headers >= 2.6.23 and kernel version >= 2.6.26. This way preallocation should work on XFS and ext4, and it probably won't hurt on other filesystems.

The posix_fallocate function from glibc previously just did the equivalent of dd if=/dev/zero of=/file, but now (with a patched glibc) it can use the kernel's fallocate syscall, which was introduced in 2.6.23. It needs more testing of course, but it works fine for me with XFS. I think the patch mentioned above should be included in the main tree; it really helps with fragmentation.
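For readers unfamiliar with the call, the sketch below shows roughly what preallocating one file with posix_fallocate looks like. It is an illustration only, not the code from the patch or from libtorrent; `preallocate_file` is a made-up name. Whether the call is fast (kernel fallocate) or slow (glibc writing zeros) depends on the glibc, kernel and filesystem combination described above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Reserve disk space for the first `size` bytes of `path`, creating the
 * file if needed. Returns 0 on success or an errno-style code on failure.
 * Note: posix_fallocate() returns the error number directly; it does not
 * set errno. */
int preallocate_file(const char *path, off_t size)
{
    assert(size > 0);  /* posix_fallocate() rejects a zero length with EINVAL */

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return errno;

    int err = posix_fallocate(fd, 0, size);
    close(fd);
    return err;
}
```

After a successful call the file has its final size and every byte in the range is backed by reserved storage, so later writes into it cannot fail for lack of disk space.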

  Changed 17 months ago by anonymous

Since posix-fallocate is posix, I think it should be enabled by default when available.

in reply to: ↑ 11   Changed 16 months ago by anonymous

Replying to anonymous:

Ultimately it's a job for the filesystem, not for each and every program to do their own thing. ext4 is supposed to have both persistent preallocation and online defragmentation.

From a code design viewpoint I totally agree. However, in a number of scenarios preallocation is not just a welcome feature but a needed one. Example: on low-end NAS devices (i.e. Maxtor shared storage and the like), transferring files downloaded via rtorrent to another device on the network is a disk-limited, not a network-limited, operation! That is, the transfer runs at around 1-2 Mbytes/sec, whereas for unfragmented files performance is greater by an order of magnitude (10-17 Mbytes/sec).

I do love rtorrent, but since I'm transferring a lot of these downloaded files, this is a major problem for me. Please do consider adding this in a future version.

  Changed 15 months ago by solsTiCe

At least one can defragment a mounted XFS filesystem quite effectively with xfs_fsr -v /device. Not sure how well it works while a torrent is downloading; I have not tried that! Better to wait until it's finished.

  Changed 15 months ago by rakshasa

Using posix_fallocate by default is doomed to fuck up things a lot. I've however been considering adding optional support for this using threads to avoid the blocking caused by the call on most systems.

However, it's not a priority atm.
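The idea of doing the allocation in a thread so the main loop does not block could look something like the sketch below. This is a guess at the general shape using POSIX threads, not the design rakshasa had in mind; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

struct prealloc_job {
    int   fd;      /* open, writable file descriptor */
    off_t size;    /* number of bytes to reserve */
    int   result;  /* 0 or an errno-style code; valid after prealloc_wait() */
};

/* Thread body: the potentially slow call happens here, off the main loop. */
static void *prealloc_worker(void *arg)
{
    struct prealloc_job *job = arg;
    job->result = posix_fallocate(job->fd, 0, job->size);
    return NULL;
}

/* Start the allocation in the background. Returns 0 or a pthread error code.
 * `job` must stay alive until prealloc_wait() has returned. */
int prealloc_start(pthread_t *thread, struct prealloc_job *job)
{
    assert(thread != NULL && job != NULL);
    return pthread_create(thread, NULL, prealloc_worker, job);
}

/* Wait for the background allocation and return its result. */
int prealloc_wait(pthread_t thread, struct prealloc_job *job)
{
    int err = pthread_join(thread, NULL);
    return err != 0 ? err : job->result;
}
```

A real client would also have to hold back writes to the file until the job finishes; the sketch leaves that out.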

in reply to: ↑ 10 ; follow-up: ↓ 22   Changed 11 months ago by anonymous

Replying to anonymous:

Replying to anonymous:

The following  workaround zeroes every file and thus should leave no holes.


That one was a bit `leaky' -- please use  this patch instead.

Hi. The pastebin link does not work. Would a repost be possible?

I'm loving rtorrent but I miss this feature.

in reply to: ↑ 21 ; follow-up: ↓ 23   Changed 11 months ago by anonymous

Replying to anonymous:

Replying to anonymous:

Replying to anonymous:

The following  workaround zeroes every file and thus should leave no holes.


That one was a bit `leaky' -- please use  this patch instead.

Hi. The pastebin link does not work. Would a repost be possible?

Sure:

 http://pastebin.ca/894036

in reply to: ↑ 22 ; follow-up: ↓ 24   Changed 11 months ago by anonymous

Replying to anonymous:

Replying to anonymous:

Replying to anonymous:

Replying to anonymous:

The following  workaround zeroes every file and thus should leave no holes.


That one was a bit `leaky' -- please use  this patch instead.

Hi. The pastebin link does not work. Would a repost be possible?

Sure:  http://pastebin.ca/894036

BTW: I never tried this patch with libtorrent 0.12.* since I'm still using 0.11.9.

in reply to: ↑ 23 ; follow-up: ↓ 29   Changed 11 months ago by anonymous

Replying to anonymous:

Replying to anonymous:

Replying to anonymous:

Replying to anonymous:

Replying to anonymous:

The following  workaround zeroes every file and thus should leave no holes.


That one was a bit `leaky' -- please use  this patch instead.

Hi. The pastebin link does not work. Would a repost be possible?

Sure:  http://pastebin.ca/894036

BTW: I never tried this patch with libtorrent 0.12.* since I'm still using 0.11.9.

Thank you very much for the patch. I can confirm that it works with 0.12.4. It zeroes all the files when a download is started, producing files with a much smaller number of extents.

One question: must CHUNK_SIZE have the same value as the block size used in the partition hosting the downloaded files? I checked mine and it had a block size of 4096 bytes, so I left the patch untouched.

follow-up: ↓ 30   Changed 11 months ago by anonymous

I've found a "problem" with the patch. Sometimes I download only part of a torrent. I add the torrent in stopped state and set priority to off for the files I don't want. With the patch, when the torrent is started all files are allocated, including those with priority set to off. This can be a problem if the files you don't want are big.

It would be better to preallocate only the ones with normal or high priority and create the ones with off priority as sparse (they will only get data written in case there is a chunk of the torrent that includes them and some file we want to download, so fragmentation is not an issue for them). If later we decide to download a file with off priority, it will be a sparse file and get fragmented (I never do that, anyway).

follow-up: ↓ 27   Changed 11 months ago by anonymous

Try the patch from ticket #888, maybe that will help here too.

in reply to: ↑ 26 ; follow-up: ↓ 44   Changed 11 months ago by anonymous

Replying to anonymous:

Try the patch from ticket #888, maybe that will help here too.

Thanks, but that won't do it (for what I want). Imagine a torrent with two big files from which we only want the first one. Torrents can have chunks that correspond to more than one file (the end of one file and the beginning of another file). As the client downloads full torrent chunks, a few bytes from the second file will be downloaded, causing the second big file to be allocated.

I made a patch that does what I wanted: When starting a file it will preallocate space for the files with normal or high priority. Files with off priority (those I don't want now or in the future) will be allocated as sparse. It uses the allocation code from one of the patches above. To be really good it should allocate in its own thread, but I just browsed the code a bit so it'll do for now.

 http://pastebin.ca/1389768
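To make the boundary-chunk problem concrete, here is a small sketch of the arithmetic. It is not taken from the pastebin patch (which has expired) or from libtorrent; the struct layout and the priority encoding (0 = off) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct torrent_file {
    uint64_t offset;   /* start of the file within the torrent's byte stream */
    uint64_t length;   /* file length in bytes */
    int      priority; /* hypothetical encoding: 0 = off, >0 = wanted */
};

/* Policy proposed in the comment above: fully allocate wanted files only,
 * leave files with priority off sparse. */
int should_preallocate(const struct torrent_file *f)
{
    return f->priority > 0;
}

/* Count the files overlapped by chunk `index`, which covers the byte range
 * [index * chunk_size, (index + 1) * chunk_size). */
size_t files_touched_by_chunk(const struct torrent_file *files, size_t nfiles,
                              uint64_t chunk_size, uint64_t index)
{
    assert(chunk_size > 0);
    uint64_t begin = index * chunk_size;
    uint64_t end = begin + chunk_size;
    size_t touched = 0;

    for (size_t i = 0; i < nfiles; i++) {
        uint64_t file_end = files[i].offset + files[i].length;
        if (files[i].length > 0 && files[i].offset < end && file_end > begin)
            touched++;
    }
    return touched;
}
```

With a 1000-byte file followed by a 5000-byte file and 512-byte chunks, chunk 1 covers bytes 512 to 1023 and therefore touches both files. Downloading it writes a few bytes into the second file even when that file's priority is off, which is why the unwanted file has to exist at all and why creating it sparse keeps the cost low.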

  Changed 11 months ago by anonymous

this would also speed up hashing of files..

in reply to: ↑ 24   Changed 11 months ago by anonymous

Replying to anonymous:

One question: must CHUNK_SIZE have the same value as the block size used in the partition hosting the downloaded files?

No, it has nothing to do with the blocksize of your fs.

in reply to: ↑ 25   Changed 11 months ago by anonymous

Replying to anonymous:

I've found a "problem" with the patch. Sometimes I download only part of a torrent. I add the torrent in stopped state and set priority to off for the files I don't want. With the patch, when the torrent is started all files are allocated, including those with priority set to off. This can be a problem if the files you don't want are big. It would be better to preallocate only the ones with normal or high priority and create the ones with off priority as sparse (they will only get data written in case there is a chunk of the torrent that includes them and some file we want to download, so fragmentation is not an issue for them). If later we decide to download a file with off priority, it will be a sparse file and get fragmented (I never do that, anyway).

This patch is really just a workaround for ecryptfs users. I've spent about 20 minutes reading the code before I wrote it..

follow-up: ↓ 32   Changed 10 months ago by anonymous

Like all those oldie adventure games: "You wait. Time passes" :)

rakshasa, this ticket is certainly not an rtorrent defect! It's more of an enhancement request. One that most NAS/low-powered-rig users on ext3 filesystems might need, but an enhancement only nevertheless.

Clean code is a good thing. A small number of options is also a good thing. But would you please make an exception for this one? Perhaps either preallocate by default or allow the user to control such behavior by rtorrent.rc configuration option?

Whatever you may decide, I must say that I find rtorrent to be a monster of stability. It is simply impossible to crash/break this thing! Congrats on a well written application mate!

in reply to: ↑ 31   Changed 10 months ago by the_other_anonymous

Replying to anonymous:

Like all those oldie adventure games: "You wait. Time passes" :) rakshasa, this ticket is certainly not an rtorrent defect! It's more of an enhancement request. One that most NAS/low-powered-rig users on ext3 filesystems might need, but an enhancement only nevertheless. Clean code is a good thing. A small number of options is also a good thing. But would you please make an exception for this one? Perhaps either preallocate by default or allow the user to control such behavior by rtorrent.rc configuration option? Whatever you may decide, I must say that I find rtorrent to be a monster of stability. It is simply impossible to crash/break this thing! Congrats on a well written application mate!

Amen!

follow-up: ↓ 34   Changed 10 months ago by rakshasa

It is pitch black. You are likely to be eaten by a grue.

in reply to: ↑ 33   Changed 10 months ago by the_other_anonymous

Replying to rakshasa:

It is pitch black. You are likely to be eaten by a grue.

lol!

follow-ups: ↓ 37 ↓ 45   Changed 10 months ago by rakshasa

Latest commit contains some code for this... Can't test it since MacOSX doesn't support posix_fallocate. To try it out, configure with '--with-posix-fallocate' and set 'system.file_allocate.set = yes'.

When starting large torrents, it should hang for a while allocating the files on disk.

  Changed 10 months ago by the _grue_eaten_anonymous

<OT>Glad to see Zork had its impact alright ;) As for me, I was a hobbit/spectrum guy back then. Oh well, mindless chatter of mine</ot>

Thank you for considering this enhancement, rakshasa! I'll be waiting with great anticipation for this feature to hit the optware distributions.

in reply to: ↑ 35   Changed 9 months ago by the_other_anonymous

Replying to rakshasa:

Latest commit contains some code for this... Can't test it since MacOSX doesn't support posix_fallocate. To try it out, configure with '--with-posix-fallocate' and set 'system.file_allocate.set = yes'. When starting large torrents, it should hang for a while allocating the files on disk.

Please forgive my slowness but this will only work on ext4, right?

  Changed 9 months ago by anonymous

 http://www.linux-mag.com/id/7272/2/

The fallocate() system call is not in most glibc’s as of this writing, but posix_fallocate() is; the problem with posix_fallocate is that if you use it on ext3, it will attempt to emulate fallocate() by writing all zeros to the file. This emulation step can be very slow, and may come as a surprise to the application that was expecting posix_fallocate() to be quick; the fallocate() system call has the advantage that if it is not present, it will fail, and the application can then decide on its own what it wants to do.

All right!

  Changed 9 months ago by the_other_anonymous

Ugh, sorry, again:

The fallocate() system call is not in most glibc’s as of this writing,
but posix_fallocate() is; the problem with posix_fallocate is that if
you use it on ext3, it will attempt to emulate fallocate() by writing
all zeros to the file. This emulation step can be very slow, and may
come as a surprise to the application that was expecting
posix_fallocate() to be quick; the fallocate() system call has the
advantage that if it is not present, it will fail, and the application
can then decide on its own what it wants to do.
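The point of the quoted passage is that fallocate() fails cleanly when the filesystem cannot preallocate, leaving the decision to the application. A minimal sketch of that pattern follows; it assumes Linux with a glibc that exposes fallocate() under _GNU_SOURCE, and the enum and function names are invented for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum prealloc_result {
    PREALLOC_OK,           /* blocks were reserved by the filesystem */
    PREALLOC_UNSUPPORTED,  /* filesystem or kernel cannot preallocate */
    PREALLOC_FAILED        /* some other error (bad fd, disk full, ...) */
};

/* Try the Linux-specific fallocate(). Unlike posix_fallocate() it never
 * silently falls back to writing zeros; it reports the lack of support
 * and lets the caller choose between zero-filling, a sparse file, or
 * giving up. */
enum prealloc_result try_fast_preallocate(int fd, off_t size)
{
    assert(size > 0);

    if (fallocate(fd, 0, 0, size) == 0)
        return PREALLOC_OK;
    if (errno == EOPNOTSUPP || errno == ENOSYS)
        return PREALLOC_UNSUPPORTED;
    return PREALLOC_FAILED;
}
```

On ext3 one would expect PREALLOC_UNSUPPORTED here, at which point the application can decide whether a slow zero-fill is acceptable instead of being surprised by it.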

  Changed 9 months ago by the_other_anonymous

Well done! It works as expected (on encfs, which like ecryptfs cannot handle sparse files).

The write-pattern is a bit weird, though:

http://img514.imageshack.us/img514/4016/69647297.jpg

But that's glibc, I guess.

  Changed 9 months ago by rakshasa

  • status changed from new to closed
  • resolution set to fixed

Considering this fixed.

  Changed 6 months ago by anonymous

rakshasa,

the change just hit the optware distro. Awesome work mate, the difference is really *vast*!

Hats off to you, you rock!

  Changed 6 months ago by anonymous

Thanks for this feature, I really missed it.

Just one comment for your consideration: If we download only a set of files of a torrent it might be better to posix_fallocate only the files that don't have off priority. Right now, chunks that span over two files might produce the allocation of a file with priority set to off. This might be a problem with big files (in terms of disk space used, I mean), for example a torrent with two DVDs from which we only want one.

in reply to: ↑ 27   Changed 2 months ago by another_anonymous

Replying to anonymous:

I made a patch that does what I wanted: When starting a file it will preallocate space for the files with normal or high priority. Files with off priority (those I don't want now or in the future) will be allocated as sparse. It uses the allocation code from one of the patches above. To be really good it should allocate in its own thread, but I just browsed the code a bit so it'll do for now.  http://pastebin.ca/1389768

This pastebin appears to have expired. Could you re-post please?

in reply to: ↑ 35   Changed 7 weeks ago by f.dumas@ellis.siteparc.fr

Replying to rakshasa:

Latest commit contains some code for this... Can't test it since MacOSX doesn't support posix_fallocate. To try it out, configure with '--with-posix-fallocate' and set 'system.file_allocate.set = yes'. When starting large torrents, it should hang for a while allocating the files on disk.


Hi,

I am running version 0.8.2 on Ubuntu. I installed the binary from Ubuntu's repositories. Then,

1- added the line 'system.file_allocate.set = yes' to ~/.rtorrent.rc
2- called 'rtorrent --with-posix-fallocate' from bash
3- got the following error message :

rtorrent: Error in option file: ~/.rtorrent.rc:37: Command "system.file_allocate.set" does not exist.

4- got an alternate error, if line 'system.file_allocate.set = yes' is commented:

rtorrent: invalid option -- '-'
rtorrent: Invalid/unknown option flag "--". See rtorrent -h for more information.

How do I use the parameter '--with-posix-fallocate' and the setting 'system.file_allocate.set = yes' that rakshasa wrote about?

Thanks.

follow-up: ↓ 47   Changed 7 weeks ago by anonymous

Use a more recent version. 0.8.2 is ancient (but upgrade libtorrent first). The --with-posix-fallocate is not a runtime option, it's a configure option. So you'll have to compile it yourself if the repository version is compiled without it.

in reply to: ↑ 46   Changed 7 weeks ago by f.dumas@ellis.siteparc.fr

Thanks.

Should I understand the fix was added only to release 0.8.5 ?

[Libtorrent-devel] LibTorrent 0.12.5 and rTorrent 0.8.5 released
Added support for using posix_fallocate on newly resized files.

Is "posix_fallocate on newly resized files" the same issue as the one covered by ticket #460 ?

  Changed 7 weeks ago by Alexander

I've got it working under Debian Lenny. Glibc was recent enough. I used rtorrent and libtorrent from unstable (0.8.6 and 0.11.6). Libtorrent needed recompile with --with-posix-fallocate, rtorrent package didn't.

If the filesystem is ext3, rtorrent freezes while preallocation is running, since ext3 doesn't support extents or preallocation and the glibc call falls back to writing zeros to the file. I reformatted the partition as xfs; now fallocate() returns immediately, and after downloading, xfs_bmap shows that the file consists of one large extent. Perfect.

By the way, I guess that if you don't want to recompile, you may try to mount xfs with allocsize=16m; it should help with fragmentation. That would work if you don't have a lot of small files on that partition.

  Changed 4 weeks ago by anonymous

Hi. This was working for me on ext3 (zeroing files) and now I've moved to ext4 and rtorrent is producing files with too many extents.

With posix_fallocate active on an ext3 filesystem files are initialized to zeros and have a reduced number of extents. For example: 1GB file, ~300 extents.

With posix_fallocate active on an ext4 filesystem files are initialized to their target size in no time (du for a file that just has started gives the total size) but the number of extents when the file has finished downloading is much bigger than with ext3. Example: 3144 extents for a 260MB file. The same file downloaded with ktorrent (which also uses posix_fallocate) resulted in 562 extents. A smaller number, but too big for such a small file.

Extents are working OK on the ext4 partition. For instance, copying a 1GB file with Dolphin to the ext4 partition produces a file with only 29 extents.

All tests were done on a filesystem with a lot of free disk space. Anyone experiencing the same behaviour? Any ideas why posix_fallocate is producing such a big number of extents?

  Changed 4 weeks ago by rakshasa

Sounds like a problem for the LKML. If the kernel does stupid things there's little we can do.

  Changed 4 weeks ago by anonymous

Thanks for the quick answer rakshasa.

I've been testing this issue a bit more. It looks like the problem is related to the way in which the file is filled after the posix_fallocate call. The allocation works and produces a file with the right size and an appropriate number of extents, but the advanced approach that rtorrent uses to write data produces a file that does not respect the original allocation, ending up with a very big number of extents and a very fragmented file. ktorrent uses a more traditional and less sophisticated method for writing the files, and they respect the extents allocated with posix_fallocate.

So there may be something wrong in ext4's management that ignores the initially allocated extents and reallocates the file while it is written to disk during download, as if posix_fallocate had never been called by the rtorrent code (while it works with other, simpler code like ktorrent and file copy). I tested with various ext4 options, particularly with delayed allocation on and off, both with the same result.

I'm not posting in the LKML because my knowledge is a bit limited for the kind of technicalities that are discussed there. I guess it's back to ext3 for me.

  Changed 4 weeks ago by anonymous

Hi again. Just wanted to add that I've tested transmission (configured with preallocation mode 1) and it works with ext4. Files are preallocated and they end up with a very small number of extents.

So ext4 preallocation works for ktorrent and transmission but not for rtorrent. From a quick browse of the code I noticed that ktorrent uses posix_fallocate64 (I'm running x86_64) instead of posix_fallocate, and transmission uses posix_fallocate in combination with posix_fadvise (to set an allocation strategy indicating we are interested in a sequential file). Does this ring a bell for anyone, or might the ext4 problem only be triggered by the advanced and efficient file handling that rtorrent does?

Thanks!

  Changed 4 weeks ago by rakshasa

  • status changed from closed to reopened
  • resolution fixed deleted

The posix_fallocate and posix_fallocate64 calls should be the same. Looking closer at the documentation for the call, it seems to also change the file size, which I already do beforehand with ftruncate.

So it might be that ext4 doesn't do what it is supposed to when we call ftruncate+fallocate or something...
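That posix_fallocate grows the file on its own, with no ftruncate beforehand, is easy to check. The helper below is a hypothetical test aid, not part of libtorrent or of the attached patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create an empty file, call posix_fallocate() WITHOUT a prior ftruncate(),
 * and return the resulting file size, or -1 on error. */
off_t size_after_fallocate(const char *path, off_t size)
{
    assert(size > 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    struct stat st;
    off_t result = -1;
    if (posix_fallocate(fd, 0, size) == 0 && fstat(fd, &st) == 0)
        result = st.st_size;

    close(fd);
    return result;
}
```

If the returned size equals the requested size, the ftruncate step is redundant for newly created files, so dropping it is one way to test the ftruncate+fallocate theory.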

Changed 4 weeks ago by rakshasa

Try this patch

  Changed 4 weeks ago by anonymous

Patch tested (rtorrent 0.8.6, libtorrent 0.12.6 compiled with --with-posix-fallocate, gentoo Linux unstable branch, kernel 2.6.31, x86_64). Didn't work for me. Downloaded a 185 MB torrent. File was preallocated as 185MB with two extents just after the posix_fallocate call. At the end of the download file had 949 extents.

If you need any other test I'll be glad to do it.

  Changed 4 weeks ago by rakshasa

Run the above program on the file you downloaded, then send me the output by email. ( sundell.software@gmail.com)

Changed 4 weeks ago by rakshasa

File extents, requires root on linux

  Changed 4 weeks ago by rakshasa

  • status changed from reopened to closed
  • resolution set to fixed
Output of the new version:

Opening file '/mnt/d1/part1.wmv'.                                       
Block size: 4096                                                        
File size:  196571350                                                   
Block/page count: 47992                                                 
   1: log        0   bb78 phys     8800   bb77    

Size of the file is 187.5 MB and filefrag reports 1505 extents.

Ok, it seems this is a problem with ext4 not merging file extents when a hole is filled between two extents. So the file itself is contiguous, but filefrag gives a large number because we write pieces to disk randomly.
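The distinction between "many extents" and "actually fragmented" can be expressed as a merge over the extent list: adjacent extents whose logical and physical ranges both continue each other form one contiguous run. The sketch below is independent of the attached file_extents.c, whose contents are not shown here; the struct and function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct extent {
    uint64_t logical;   /* first logical block of the extent */
    uint64_t physical;  /* first physical block of the extent */
    uint64_t length;    /* number of blocks in the extent */
};

/* Count physically contiguous runs in a list of extents sorted by logical
 * block. Extents that the filesystem reports separately but that sit next
 * to each other on disk are merged into a single run. */
size_t count_contiguous_runs(const struct extent *ext, size_t n)
{
    assert(ext != NULL || n == 0);

    size_t runs = 0;
    for (size_t i = 0; i < n; i++) {
        int continues_previous = i > 0
            && ext[i].logical  == ext[i - 1].logical  + ext[i - 1].length
            && ext[i].physical == ext[i - 1].physical + ext[i - 1].length;
        if (!continues_previous)
            runs++;
    }
    return runs;
}
```

A file for which filefrag reports hundreds of extents but this count is 1 matches the single "log ... phys ..." line in the output above: many extent records, one contiguous region on disk.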

  Changed 11 hours ago by sandeen@sandeen.net

not sure the test is working quite right.

It's apparently broken at least for some glibcs... it doesn't find the FALLOC_FL_KEEP_SIZE flag definition used in the configure test; on my box anyway that's only in <linux/falloc.h>

the library code doesn't seem to use the flag either though, so easiest fix might be to just change the flag in the test to "0"
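For reference, a probe that uses the flag and pulls it from the kernel header explicitly might look like the following. This is a sketch of the idea only, not the actual configure test; it assumes Linux with kernel headers installed, and `reserve_keep_size` is a made-up name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>  /* FALLOC_FL_KEEP_SIZE, for glibcs whose <fcntl.h> lacks it */
#include <unistd.h>

/* Reserve blocks for the first `size` bytes of `fd` without changing the
 * reported file size. Returns 0 on success or the errno value on failure. */
int reserve_keep_size(int fd, off_t size)
{
    assert(size > 0);

    if (fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, size) != 0)
        return errno;
    return 0;
}
```

Passing 0 instead of the flag, as suggested above, avoids the header dependency entirely; the call then also extends the file size, the same visible effect as posix_fallocate.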
