NES restrictions can be summed up thus: 3 color sprites, 4 if you made use of the entire square
Nope. It's always 3 colors (plus transparency) per hardware sprite. As a matter of fact, filling the entire square with non-transparent pixels is even worse.
Super Bat Puncher takes advantage of its mostly black backgrounds and uses transparency to get black in a sprite. (Like for the cat's eyes. If you look closely, his eyes change color when they overlap a non-black part of the background.)
6 colors if you are willing to settle for having either only two sprites onscreen or loads of flicker + major slowdown.
Nope. As said above, it's always 3 colors per hardware sprite. Even assuming you're talking about sprite layering to get said 6, layering would actually let you get up to 12 colors in the same area using only sprites.
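To put numbers on the layering point, here's a rough sketch. It assumes the usual NES setup of 4 sprite palettes with 3 colors plus transparency each, and that every layered sprite uses a different palette. The function name is just something I made up for illustration.

```python
# Assumption: 4 sprite palettes, each with 3 colors plus transparency.
SPRITE_PALETTES = 4
COLORS_PER_PALETTE = 3


def max_layered_colors(layers: int) -> int:
    """Upper bound on distinct sprite colors in one area when stacking sprites.

    Hypothetical helper: each layer is assumed to use a different palette,
    so layers beyond the number of palettes add no new colors.
    """
    if layers < 0:
        raise ValueError("layers must be non-negative")
    distinct_palettes = min(layers, SPRITE_PALETTES)
    return distinct_palettes * COLORS_PER_PALETTE


if __name__ == "__main__":
    for layers in range(1, 5):
        print(layers, "layer(s):", max_layered_colors(layers), "colors")
```

So two layers is where the 6 comes from, and four layers gets you the 12.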
Sprite layering doesn't limit you to two sprites (you still get 64 on screen), nor does it contribute any more to slowdown than having a thing made of two non-layered sprites (actually, it's probably usually less, but it depends on how you do it). The cost of adding one layered sprite, compared to not adding that sprite at all, is 0.15% of the CPU time available in a frame (44 cycles out of ~29780), assuming it's directly overlaid on another previously drawn sprite. Even if you don't want to make that assumption, or I'm missing something else, that time doubled is still slightly less than a third of ONE PERCENT of the time you have in a frame.
True about flicker, though two sprites vs. one, or even four sprites vs. two, doesn't make that big an impact.
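If you want to check the arithmetic, here's a quick sketch using only the two figures quoted above (44 cycles per extra layered sprite and roughly 29780 CPU cycles per frame). The names are mine, and the 44-cycle figure depends on how your sprite-drawing code is written.

```python
# Figures taken from the post; both are approximate.
CYCLES_PER_FRAME = 29780      # rough CPU cycles available per frame
CYCLES_PER_EXTRA_SPRITE = 44  # assumed cost of one directly overlaid sprite


def frame_percent(cycles: int, frame_cycles: int = CYCLES_PER_FRAME) -> float:
    """Return the given cycle count as a percentage of one frame's CPU time."""
    return 100.0 * cycles / frame_cycles


single = frame_percent(CYCLES_PER_EXTRA_SPRITE)
doubled = frame_percent(2 * CYCLES_PER_EXTRA_SPRITE)

if __name__ == "__main__":
    print(f"one extra sprite: {single:.2f}% of a frame")
    print(f"cost doubled:     {doubled:.2f}% of a frame")
```

Doubling the cost still leaves it under a third of a percent (about 0.33%) of the frame.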