Check Content directory for new files while running

JAMITIN
Offline
Junior Member
Posts: 11
Joined: Thu Dec 15, 2011 11:06 pm
IRCBot Version I Use: v5
IRCBot Platform: FreeBSD

Post by JAMITIN »

Having a way to check just new files (periodically or by command), as opposed to reloading the entire directory, would be useful if somebody wants to include a file uploader on their site so users can upload straight to the Content directory.

autodj-reload causes the stream to disconnect, so perhaps just having that run independently somehow would be sufficient.

Indy
Offline
Site Admin
Posts: 465
Joined: Thu Oct 16, 2008 1:58 pm
IRCBot Version I Use: v5
IRCBot Platform: Linux/Ubuntu

Re: Check Content directory for new files while running

Post by Indy »

To find new files it has to rescan the whole directory (just like any program). There isn't currently a timed option to rescan files but you can use !autodj-reload to do it manually any time; I should be able to add a timed option as well. Until then, one trick you can do is create an entry in your scheduler.conf to request a file that doesn't exist; that would rescan for files however often you made the request fire off.
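As a rough illustration of why finding new files means walking the whole directory, here is a minimal Python sketch. It is not IRCBot's code; `find_new_files` and `known_paths` are hypothetical names, and the set of already-known paths is assumed to come from whatever the program stored on its last scan.

```python
import os

def find_new_files(content_dir, known_paths):
    """Walk content_dir and return the paths that are not in known_paths.

    known_paths is a set of paths recorded by a previous scan (hypothetical;
    a real program would load this from its own database).
    """
    new_paths = []
    # There is no shortcut here: every directory entry has to be listed
    # before we can tell which ones were not seen before.
    for root, _dirs, files in os.walk(content_dir):
        for name in files:
            path = os.path.join(root, name)
            if path not in known_paths:
                new_paths.append(path)
    return sorted(new_paths)
```

Running this on a timer and feeding the result to the player would be one way to approximate a timed rescan from outside the bot.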

The stream can disconnect during a rescan if the current song stops playing while the scan is in progress, but it doesn't force the stream to disconnect as a rule.
You can also contact me via email or instant messenger, links are at the bottom of this post.

Post by JAMITIN »

Ah, good to know about the scheduler trick. I just tried running !autodj-reload a few times; I didn't realize it was a coincidence that it was disconnecting before. D'oh. I'm running a few thousand files, with Content pointing to a 20 Mbit NFS share, so yeah. The reload in ircbot5 is fast nonetheless. :D

Post by Indy »

How long does it usually take to reload over NFS? I've always wondered how slow it would be scanning a remote filesystem.

Post by JAMITIN »

It depends. On ircbot5, a first load takes about 5 minutes per 1k files. After that I can do !autodj-reload and it'll be less than a minute total. Over the summer, it was more like 45 minutes every startup (for 40k files) on ircbot4, thanks to the mysql queue. I'll be able to post a more coherent benchmark in a couple of days, after I organize/transfer some files.

Post by Indy »

As part of the file scan it calls stat() on each file to get its last modified time and file size, which determine whether the bot should re-read its ID3 tags. I imagine that would take a lot of time with thousands of files over a network link. I could probably make an option to read tags for new files only, and that *should* make it go a lot faster. The cost is that if you update any of your ID3 tags or files, the metadata won't be updated in the bot unless you do a full re-scan.
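The check described above can be sketched in Python roughly as follows. This is only an illustration under assumptions, not the bot's actual implementation: `scan`, `cache`, and `only_new` are hypothetical names, and the cache of (mtime, size) pairs stands in for whatever the bot stores between scans.

```python
import os

def scan(content_dir, cache, only_new=False):
    """Return the paths whose tags should be (re)read, updating cache in place.

    cache maps path -> (mtime, size) from the previous scan (hypothetical).
    With only_new=True, files already in the cache are skipped without
    calling os.stat() on them, so edits to known files go unnoticed.
    """
    to_read = []
    for root, _dirs, files in os.walk(content_dir):
        for name in files:
            path = os.path.join(root, name)
            if only_new and path in cache:
                continue  # known file: skip the explicit stat call
            st = os.stat(path)  # one metadata lookup per file
            signature = (st.st_mtime, st.st_size)
            if cache.get(path) != signature:
                cache[path] = signature
                to_read.append(path)
    return sorted(to_read)
```

On a network filesystem each `os.stat()` can mean a round trip to the server, which is where skipping known files might save time.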

Post by JAMITIN »

Ah, that sounds great.

Post by JAMITIN »

Alright, here are the benchmarks.

With queue_mysql and NFS, after wiping ircbot.db and the IRCBot mysql table, file checking for 31827 files (with metadata) takes 3 hours 16 minutes. That's about the same as checking locally with queue_mysql, while queue_memory is faster (both locally and over NFS). Restarting IRCBot with 3k files added took 32 minutes; an autodj-reload right after that took 28 minutes.
[15:14:32] <JAM> autodj-reload
[15:14:32] <Touhou_Radio> Reloading schedule...
[15:14:32] <Touhou_Radio> Beginning re-queue...
[15:42:19] <Touhou_Radio> Re-queue complete! (Songs: 38543)
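For anyone wanting to check the arithmetic, the elapsed time and approximate rate can be worked out from the two timestamps and the song count in the log above. This is just a small Python sketch of the calculation; the variable names are arbitrary.

```python
from datetime import datetime

# Timestamps and song count copied from the log lines above.
start = datetime.strptime("15:14:32", "%H:%M:%S")
end = datetime.strptime("15:42:19", "%H:%M:%S")
songs = 38543

elapsed = (end - start).total_seconds()
minutes, seconds = divmod(int(elapsed), 60)

# prints "27 min 47 s, 23.1 songs/s"
print(f"{minutes} min {seconds} s, {songs / elapsed:.1f} songs/s")
```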

Post by Indy »

A new build is up now with the new OnlyScanNewFiles option.

It goes in AutoDJ/Options:
OnlyScanNewFiles 1

That will remove the extra stat() call and maybe make it a bit faster? (I tested on a local hard drive with 5,000 files and the re-scans finished in 8 seconds with or without the setting, so I'm not sure how much it will help.)

Post by JAMITIN »

Awesome, thanks.
