The old "index of" or "parent directory" trick is now used by a lot of websites to attract traffic, even if you add -html -htm -shtml -phtml -php -buy -aspx -jsp -asp -cgi -pdf -ftp to the query, which is what I just tested with the Google music query (the first one). A lot of the results are not even open indexes. Some are fake open indexes, some are password protected, and some redirect to spam links.
I don't think it is practically possible to exclude the spam sites with search operators alone, because the query becomes too large. The only way to check is to process each result URL itself, which would have to be done on the server or with a client-side script. However, there is still a chance of finding material in this simple way.
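
To give a rough idea of what "processing the URL itself" could look like, here is a minimal Python sketch. It only looks at HTML that has already been downloaded and guesses whether it is a real server-generated listing. The function name `looks_like_open_index`, the "Index of /" title check, and the 0.8 threshold are my own assumptions for illustration, not a tested filter; a fake index that copies the layout closely would still get through.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


class _LinkCollector(HTMLParser):
    """Collects the <title> text and every <a href> value from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.hrefs = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def looks_like_open_index(html, min_relative_ratio=0.8):
    """Heuristic guess (assumption, not a guarantee) that html is a real listing."""
    parser = _LinkCollector()
    parser.feed(html)
    # Assumed convention: server-generated listings are titled "Index of /path".
    if not re.match(r"\s*index of\s+/", parser.title, re.IGNORECASE):
        return False
    if not parser.hrefs:
        return False
    # Real listings link almost only to relative paths (files and subfolders);
    # fake ones tend to point at other hosts.
    relative = [h for h in parser.hrefs if not urlparse(h).netloc]
    return len(relative) / len(parser.hrefs) >= min_relative_ratio
```

Downloading the page is a separate step (for example with `urllib.request.urlopen`). A password-protected directory would normally fail at that point with an HTTP 401 error, and a redirect to another host can be spotted by comparing the final URL with the one requested.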