Hello, I like the software, but it really seems to struggle on large sites.
In an effort to work around this, I set up some URL rewrites and aimed the crawler specifically at the forum first.
I get the following issues:
1. It runs, then at around 4k URLs threads start to hang: rather than 5 or 7 threads being processed, only 2 are, while the others look like they are still streaming but no KB are being transferred.
2. I know the forum alone should produce over 100k URLs, but if I’m lucky I might get 10k.
3. Memory isn’t an issue: I have a quad processor with 4GB RAM and I assigned 1GB to Java, so it shouldn’t run into any memory problems.
4. It would be nice to have a feature that lets you kill and reset threads when they hang. Even stopping the crawler has no effect on them.
Don’t get me wrong, I think the tool is fantastic; I just wish it would work much better. If you have any suggestions, please advise. The URL is www.rcheliaddict.co.uk
My other option is to write some PHP to generate my own sitemap, tapped directly into the DB, and output the .xml files. Time is the issue, and I’d rather use a reliable system than have to start coding one myself.
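For reference, the fallback I have in mind would look roughly like the sketch below. It is in Python rather than PHP just to show the idea, and it assumes the thread URLs have already been pulled from the forum database into a list (the query itself is not shown, and the function names and file names are my own placeholders). Since the sitemap protocol limits each file to 50,000 URLs, a forum with over 100k URLs would need several sitemap files plus an index file pointing at them.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file URL limit in the sitemap protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield successive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Return one <urlset> document as a string."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_index(sitemap_urls):
    """Return a <sitemapindex> document listing the sitemap files."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_all(urls, base="http://www.rcheliaddict.co.uk"):
    """Split `urls` into sitemap files and build an index for them.

    Returns (index_xml, {filename: sitemap_xml}). The file naming
    scheme (sitemap-1.xml, ...) is an arbitrary choice for this sketch.
    """
    files = {}
    for i, part in enumerate(chunk(urls), start=1):
        files[f"sitemap-{i}.xml"] = build_sitemap(part)
    index = build_index([f"{base}/{name}" for name in files])
    return index, files


if __name__ == "__main__":
    # Hypothetical URL pattern standing in for rows read from the DB.
    sample = [f"http://www.rcheliaddict.co.uk/forum/thread-{n}" for n in range(1, 6)]
    index_xml, sitemap_files = build_all(sample)
    for name, xml in sitemap_files.items():
        with open(name, "w", encoding="utf-8") as fh:
            fh.write('<?xml version="1.0" encoding="UTF-8"?>\n' + xml)
    with open("sitemap-index.xml", "w", encoding="utf-8") as fh:
        fh.write('<?xml version="1.0" encoding="UTF-8"?>\n' + index_xml)
```

Going straight to the database like this would sidestep the crawling problem entirely, but it only covers pages I can enumerate from the DB, which is why I would still prefer the crawler to work.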

Comments