WebSecurity.mobi

Focused legacy troubleshooting archive

Curated guide

Sitemap Generator Java Memory Error

Fix Java memory and runtime failures that stop the AuditMyPC XML Sitemap Generator before it finishes crawling a site.

Problem Summary

This guide combines the archive threads where the sitemap generator stalled, opened a blank window, threw out-of-memory messages, or silently did nothing after the user clicked the Sitemap tool. The exact wording changed from thread to thread, but the pattern was consistent: the crawl failed before the site could be processed cleanly.

The archive evidence shows two broad versions of the same problem. Some users never got any visible activity at all. Others could start a crawl, but on larger sites it bogged down, exhausted memory, or failed once the queue grew into the tens of thousands of URLs. The value of the page today is in diagnosing those old failure patterns, not in reviving the original applet model for current browsers.

Comment Highlights

  • One user described the failure bluntly: no URLs appeared, no error message showed up, and there was no activity at all after launching the tool.
  • A first-time user asked whether increasing Java memory would fix the issue, which captures how confusing the tool's runtime settings were for non-technical users.
  • Another thread reported that setting Java memory to 256 MB made the crawl 'fly', which confirms that some failures really were local runtime limits rather than bad crawl logic.
  • On larger sites, users reported the generator grinding to a halt above roughly 40,000 URLs, while another case still failed even after the user pushed the Java setting close to 1 GB.
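The 256 MB report above maps to the JVM's maximum heap ceiling. As a hedged modern sketch (the original applet was configured through the Java Control Panel, not code; the class name here is illustrative), a program can report the ceiling it was actually given, which is often the first thing to check when a crawl dies mid-run:

```java
// Sketch: report the JVM's configured memory ceiling.
// The old Java Control Panel memory setting corresponds roughly to
// the -Xmx flag on a modern JVM, e.g. `java -Xmx256m HeapCeiling`.
public class HeapCeiling {
    /** Maximum heap in megabytes, as the JVM reports it. */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMb() + " MB");
        // A crawl holding tens of thousands of URL strings plus parsed
        // page data can exhaust a small ceiling long before the network
        // or the target site becomes the bottleneck.
    }
}
```

If the reported ceiling is far below what the user believes they configured, the "fix" that made the crawl fly was almost certainly the heap setting finally taking effect.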

Likely Causes

  • The Java runtime ceiling was too low for the number of pages, images, or duplicate URLs the tool was trying to hold in memory.
  • The local Java install or browser runtime was unstable, out of date, or inconsistent enough that the applet opened but never actually began crawling.
  • The site was simply too large or too noisy for one pass, especially when non-essential URLs had not been filtered out first.
  • Background programs or low available memory on the local machine reduced how much headroom the crawler had before failing.
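The duplicate-URL cause above is concrete: a crawl frontier that enqueues every discovered link without tracking what it has already seen grows without bound on any site whose pages link back to each other. A minimal sketch of the defense (not the generator's actual implementation; `Frontier` is a hypothetical name):

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Sketch: a crawl frontier that refuses to enqueue a URL twice.
// Without the `seen` set, circular links make the queue grow until
// the heap ceiling is hit -- the pattern behind the archive reports.
public class Frontier {
    private final Queue<String> queue = new ArrayDeque<>();
    private final Set<String> seen = new HashSet<>();

    /** Enqueue a URL only on first sight; returns true if accepted. */
    public boolean offer(String url) {
        if (!seen.add(url)) {
            return false; // duplicate -- would only waste memory
        }
        return queue.offer(url);
    }

    public String poll() {
        return queue.poll();
    }

    public int pending() {
        return queue.size();
    }
}
```

A tool missing this kind of deduplication would show exactly the archive symptom: fine on small sites, then memory exhaustion once the URL space loops back on itself.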

What Still Applies

  • Reduce the crawl before you increase the runtime. Excluding images, duplicate paths, search results, and other low-value URLs helps more than blindly giving the crawler more memory.
  • If a crawl tool fails before it starts, confirm whether the failure is local. Try a second browser, a second machine, or a smaller test target before assuming the site itself is broken. If the crawl starts but then dies or stalls after a shallow pass, compare the pattern with XML Sitemap Generator Not Reading Past First Page.
  • Large crawl jobs need realistic limits. Even modern crawlers can fail or slow down badly when the URL space is much larger than the user realizes. The large-site failure mode overlaps with the archive patterns in Google Sitemap Restrictions for Large Sites.
  • When you are diagnosing the preserved legacy applet itself, a runtime reset or reinstall can explain a historical failure pattern. For a current crawler stack, the durable lesson is scope control and environment diagnosis, not the old Java setup steps.
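The "reduce the crawl before you increase the runtime" advice can be made concrete with a pre-enqueue filter. The sketch below is illustrative only: the extensions and query fragments are example exclusions, not the generator's real filter syntax, and `ScopeFilter` is a hypothetical name:

```java
import java.util.List;

// Sketch: drop low-value URLs before they ever enter the crawl queue.
// The extension and fragment lists below are example exclusions, not
// the AuditMyPC generator's actual filter configuration.
public class ScopeFilter {
    private static final List<String> SKIP_EXTENSIONS =
            List.of(".jpg", ".png", ".gif", ".css", ".js");
    private static final List<String> SKIP_FRAGMENTS =
            List.of("/search?", "?sort=", "?sessionid=");

    /** True if the URL is worth crawling for a sitemap. */
    public static boolean inScope(String url) {
        String lower = url.toLowerCase();
        for (String ext : SKIP_EXTENSIONS) {
            if (lower.endsWith(ext)) {
                return false; // image or asset, not a sitemap page
            }
        }
        for (String frag : SKIP_FRAGMENTS) {
            if (lower.contains(frag)) {
                return false; // search results or noisy query variants
            }
        }
        return true;
    }
}
```

Filtering at enqueue time shrinks both the queue and the visited set, which is why scope control tends to help more than raising the memory ceiling.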

Legacy Notes

The Java applet workflow in these threads is legacy. Modern browsers removed that model years ago, so the exact browser and applet steps should be treated as historical context, not current support guidance or a modern browser fix. Do not treat the archive as a reason to re-enable obsolete browser components on a current system.

What remains useful is the diagnostic pattern: memory errors usually mean the crawl scope is too broad, the runtime is unstable, or both. The safest modern takeaway is to shrink the job and inspect the environment, not to assume an old Java tweak is the right answer on a current system.

Related Guides

Parent Hub


XML Sitemap Generator Help

Legacy support hub for the AuditMyPC XML Sitemap Generator, including crawl limits, Java errors, odd exports, and duplicate URL problems.