I’m trying to build a sitemap for http://www.articlesnatch.com
Realizing that there are a lot of frivolous pages, I have set the filters to include only the category and article pages. However, this is still about 140,000 pages.
I changed my Java parameters to ‘-xmx512m’ and it still says that there is only 96 MB of RAM available…
Should I leave the ‘ marks in there? I have tried it both with and without the ‘ marks.
I love this tool and your others, and I tell others about them as well.

Hello and welcome to the Forum!
It should be:
-Xmx512m
Try making the first X upper case, since Java options are case-sensitive. No ‘ or " marks.
Also, you should be using the Webmaster Tool rather than the Sitemap Generator, if you are not doing so now.
Nice domain name BTW! Nice site as well, good job!
Regards,
Jim.
OK, now I have it correct. Do I need to completely close out of the browser for it to apply?
I’m building a sitemap now for another domain of mine… so far I’m up to 60,000 URLs… I hope it doesn’t crash…
It worked!!!
I guess my only question now is: if I hit Save Project and then open it again, it doesn’t load anything.
Glad to hear it worked!
Right now, you can’t save your project parameters out, so every time you need to create a sitemap, you’ll have to add your exclude filters again. It’s not a lot of work, but it can still be a pain for large exclude lists.
I save out my excludes in a text file and then cut and paste.
It’s a bug I am working on – just a matter of time.
Regards,
Jim.
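The text-file workaround mentioned above can be made a little less manual. The following is a minimal Python sketch of that idea, not part of the tool itself: the helper names `parse_excludes`, `load_excludes` and `is_excluded`, the sample patterns, and the assumption that an exclude filter is a plain substring match are all hypothetical, and the tool's real filter syntax may differ.

```python
from pathlib import Path


def parse_excludes(text):
    """Return the non-empty, non-comment lines of text as a list of patterns."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def load_excludes(path):
    """Read exclude patterns from a UTF-8 text file, one pattern per line."""
    return parse_excludes(Path(path).read_text(encoding="utf-8"))


def is_excluded(url, patterns):
    """Assumption: a URL is excluded when it contains any pattern as a substring."""
    return any(pattern in url for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical patterns and URLs, for illustration only.
    sample = "# pages to skip\n/login\n/print/\n\n?sort="
    patterns = parse_excludes(sample)
    for url in ["http://example.com/article/1", "http://example.com/print/1"]:
        print(url, "excluded" if is_excluded(url, patterns) else "kept")
```

Keeping one pattern per line means the same file can still be pasted straight into the exclude filter box, while a script like this can be used to check a list of URLs against the patterns beforehand.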
That’s what I have done as well. Thanks for the heads up. I just wanted to make sure it wasn’t me.
Thanks for the help and the awesome tool. If I can get this to work for the article directory – that will be huge – I have been looking for something for the last year to do it…
It will work for the article directory; it should have no problem spidering 140,000 pages. It will take a little while, but with the memory increase (and with threads increased to 9), you should be just fine.
Regards,
Jim.
I’m gonna ask a stupid question…
I’m trying to build a sitemap for another site. However, the tool finds pages that don’t exist. Then, because the site displays the homepage as the 404 page, the tool thinks that all the content it sees belongs in the new directory, and basically begins to spider all the pages again. Any way to stop it?
The domain is mjesales.com/
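The behaviour described above is commonly called a "soft 404": a request for a missing page is answered with the homepage content, so a crawler sees what looks like a valid page and follows its relative links under the non-existent path. One way a crawler could guard against this, sketched below in Python, is to fingerprint the homepage body and skip any other URL whose body has the same fingerprint. The function names and sample URLs are hypothetical, and this is not a description of how the tool itself works; pages that vary slightly per request (timestamps, session tokens) would need a fuzzier comparison than an exact hash.

```python
import hashlib


def fingerprint(html):
    """Hash the page body after collapsing whitespace, so spacing differences are ignored."""
    normalized = " ".join(html.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_soft_404(url, html, home_url, home_fingerprint):
    """Flag a page that is not the homepage but whose body matches the homepage."""
    if url.rstrip("/") == home_url.rstrip("/"):
        return False
    return fingerprint(html) == home_fingerprint


if __name__ == "__main__":
    # Hypothetical content, for illustration only.
    home_url = "http://example.com/"
    home_html = "<html><body>Welcome</body></html>"
    home_fp = fingerprint(home_html)
    print(is_soft_404("http://example.com/missing/", home_html, home_url, home_fp))
```

A crawler using a check like this would simply not queue links found on a flagged page, which stops the site from being spidered a second time under the bogus directory.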
Not a stupid question – perfectly valid!
Can you give me a few examples of the pages it found that do not exist?
Regards,
Jim.
I got it to work, wahoo!!!
I just found the files that were messing it up and ignored them… thanks for your help!
We aim to please.
If you’re happy and you know it … – pass the word
Regards,
Jim.