Webmaster Tool - Sitemap Generator
Questions about building a sitemap, including the XML Sitemap, HTML sitemap, Google and Yahoo! Sitemaps.
|
I'm getting multiple results for some pages, each with different select, sort, and/or session ID parameters.

Example:

2550 http://www.avalanche-center.org/Education/Courses/NewMexico.php?tfm_order=ASC&tfm_orderby=date
2551 http://www.avalanche-center.org/Education/Courses/NewMexico.php?tfm_order=ASC&tfm_orderby=state
2552 http://www.avalanche-center.org/Education/Courses/NewMexico.php?tfm_order=ASC&tfm_orderby=location
2553 http://www.avalanche-center.org/Education/Courses/NewMexico.php?tfm_order=ASC&tfm_orderby=course
2554 http://www.avalanche-center.org/Education/Courses/NewMexico.php?tfm_order=ASC&tfm_orderby=sponsor
2555 http://www.avalanche-center.org/Education/Courses/NewMexico.php?tfm_order=ASC&tfm_orderby=cost

Is there a way to get the tool to ignore anything after a question mark, and thus combine entries like the six above into a single one?

Thanks,
Jim
|
OK, I found a way to do this with exclusions. It would be nice if this were an option in the future though, perhaps as a checkbox on the Settings page.

I used a few different ? exclusions:

*php?*
*/?PHP*
*/?s*

This seems to work OK. There is still one version of each page in question without any parameters. I was afraid the exclusions might eliminate the page entirely, but they didn't seem to. As far as I can tell, they didn't eliminate anything else that I didn't intend to eliminate. It's a little hard to tell with so many pages to skim through, but it looks OK.

Passing parameters after a ? seems very common, and it's been an issue with other website tools I use too, which is why a checkbox option would be nice to have.
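For anyone checking what those three exclusions will and won't catch before running a full crawl, here is a rough Python sketch. It assumes the tool treats `*` as "any run of characters", treats `?` as a literal question mark, and matches case-sensitively; the thread doesn't confirm any of that, and the helper names are made up for illustration.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    # Assumption: "*" means "any run of characters" and every other
    # character, including "?", is matched literally.
    parts = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile(".*".join(parts))

# The three exclusions quoted in the post above.
EXCLUSIONS = ["*php?*", "*/?PHP*", "*/?s*"]
RULES = [wildcard_to_regex(p) for p in EXCLUSIONS]

def is_excluded(url: str) -> bool:
    # A URL is dropped if any exclusion matches the whole URL.
    return any(rule.fullmatch(url) for rule in RULES)

urls = [
    "http://www.avalanche-center.org/Education/Courses/NewMexico.php",
    "http://www.avalanche-center.org/Education/Courses/NewMexico.php?tfm_order=ASC&tfm_orderby=date",
    "http://www.avalanche-center.org/Education/Courses/NewMexico.php?tfm_order=ASC&tfm_orderby=state",
]

# Only the parameter-free version of the page survives.
kept = [u for u in urls if not is_excluded(u)]
```

Under those assumptions the plain `NewMexico.php` URL is kept and the parameterized variants are dropped, which matches what was observed in the generated sitemap.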
|
I'm thinking that if a checkbox were checked, anything after a ? in a URL would be ignored. All the URLs listed above would then be treated as the same and listed as one page.

I don't think a ? should appear in a URL unless it comes before parameters such as sort keys, although I'm not 100% certain.

Right now I'm excluding the URLs with a ? in them and counting on at least one version of each page showing up without the ?. I think it would be more reliably thorough if these pages were not excluded, but condensed into a single page whenever everything before the ? is the same. That way the tool shouldn't miss a page that only shows up with various parameters and, for some reason, doesn't show up at all without a ?. Hopefully that makes sense.

I've seen similar behavior (listing multiple variations) in an old version of linkscan that I use. It's great for checking links, but it doesn't do sitemaps and isn't free, so my version is very old now.

The URLs above are the result of PHP code on the page that selects and sorts info from a database listing. The other exclusion rules come from scripts for reciprocal links and for something else; these are larger, more complicated scripts that I installed but didn't write.

Jim
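To make the "condense instead of exclude" idea concrete, here is a minimal Python sketch of what such a checkbox might do. It is not how the sitemap generator actually works; the function names and the `example.com` URL are hypothetical, and it assumes that everything after the ? can safely be thrown away.

```python
from urllib.parse import urlsplit, urlunsplit

def strip_query(url: str) -> str:
    # Keep scheme, host, and path; discard the query string and fragment.
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

def collapse(urls):
    # Return one entry per distinct "everything before the ?",
    # preserving the order in which the pages were first seen.
    seen = set()
    result = []
    for url in urls:
        base = strip_query(url)
        if base not in seen:
            seen.add(base)
            result.append(base)
    return result

crawled = [
    "http://www.avalanche-center.org/Education/Courses/NewMexico.php?tfm_order=ASC&tfm_orderby=date",
    "http://www.avalanche-center.org/Education/Courses/NewMexico.php?tfm_order=ASC&tfm_orderby=state",
    "http://www.avalanche-center.org/Education/Courses/NewMexico.php",
    # Hypothetical page that is only ever linked with parameters.
    "http://example.com/listing.php?sort=date",
]

sitemap = collapse(crawled)
```

Here the three `NewMexico.php` variants become a single entry, and the hypothetical `listing.php` page, which never appears without a ?, still gets listed once instead of being excluded altogether.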
|
An editable list of excluded parameters would solve the problem quite simply.

The standard start of a parameter list is ?, and the standard join is &. Each parameter name ends in =, followed by its value. With that in mind, you could add an editable parameter list where you enter the word that sits between the ? and the =, and it would be excluded from the sitemap.

The problem with the solution above would be ecommerce and CMS websites, as these use a parameter such as ?page=newproduct (whichever page it is). All of these pages, which make up the majority of the pages on such sites, would be excluded. I am not sure whether you need them in your sitemap, but if you do, that would be a big problem, and there may be similar problems on other types of sites.

Hope that helps.
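One possible reading of the parameter-list idea, sketched in Python: drop only the named parameters and keep everything else, so that content-selecting parameters like `page` survive. The list contents, the function name, and the `example.com` URL are assumptions for illustration, not features of the tool.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical user-edited list: the names that sit between "?" (or "&") and "=".
EXCLUDED_PARAMS = {"tfm_order", "tfm_orderby"}

def drop_params(url: str, excluded=EXCLUDED_PARAMS) -> str:
    # Remove only the listed parameters; leave all others untouched.
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in excluded
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

sorted_listing = drop_params(
    "http://www.avalanche-center.org/Education/Courses/NewMexico.php?tfm_order=ASC&tfm_orderby=date"
)
# Hypothetical ecommerce-style URL where "page" selects the content.
product_page = drop_params("http://example.com/index.php?page=newproduct&tfm_orderby=cost")
```

With this approach the sort variants all reduce to the plain `NewMexico.php` URL, while the hypothetical `?page=newproduct` URL keeps its `page` parameter and stays a distinct sitemap entry.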