thanks for the replies so far. lots of things to digest

it always seems to require custom code, but it can be done.
To be fair, i've been dabbling with different engines lately, and it seems all of them require custom code when it comes to tilemap rendering. The support for that is overwhelmingly bad, be it ue4 (although i haven't looked at that for a year now), unity, which has no tilemap support at all (unless you're licensed and get into the unity2dalpha, which looks alright so far), or godot (which is just hilariously bad in that regard). Tilemaps are such an easy problem to solve, and it confuses me that some engines just totally neglect them. (to be fair, flashpunk's tilemaps were alright for a base implementation, and since you got the source they were easily extendable.) So yeah, writing custom code is always necessary if you want flexibility and performance, at least in the current-day engines for 2d.
The reason it isn't done more often is because it creates a performance hit [...]
Performance impact shouldn't be too harmful on any system. Even if you have a huge map with lots of objects, you can easily cull everything offscreen and y-sort everything else. Unless you have thousands of objects (like every single strand of grass being standalone) it should be more than fine. Picking up on that, you can also just shove every object into a single vertex buffer to get done with them in one draw call. It requires some technical finesse, true, but it's not outlandishly difficult. But you're right about the level of complexity, both in the codebase and in the level design. Although the latter part could be solved by having good tools for it.
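To make the cull-then-y-sort idea concrete, here's a rough sketch in plain python. The `Sprite` class, its fields and `visible_sorted` are made-up names just for illustration, and it assumes a top-left origin with y growing downwards; a real engine would probably use a spatial grid instead of scanning every object.

```python
from dataclasses import dataclass


@dataclass
class Sprite:
    # Hypothetical object record: world-space rectangle, top-left origin.
    name: str
    x: float
    y: float
    w: float
    h: float

    @property
    def feet_y(self) -> float:
        # Bottom edge of the sprite; used as the depth key for y-sorting.
        return self.y + self.h


def visible_sorted(sprites, cam_x, cam_y, view_w, view_h):
    """Drop sprites fully outside the camera rect, then sort back to front."""
    on_screen = [
        s for s in sprites
        if s.x < cam_x + view_w and s.x + s.w > cam_x
        and s.y < cam_y + view_h and s.y + s.h > cam_y
    ]
    # Sprites whose feet are higher on screen are drawn first (further back).
    return sorted(on_screen, key=lambda s: s.feet_y)


sprites = [
    Sprite("tree", 10, 0, 16, 48),
    Sprite("player", 20, 20, 16, 16),
    Sprite("far_rock", 1000, 1000, 16, 16),
]
draw_order = [s.name for s in visible_sorted(sprites, 0, 0, 320, 240)]
```

The sorted list is also the order in which you'd append quads to one shared vertex buffer, so the whole object layer can go out in a single draw call.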
World tiles can be animated, even dynamically (e.g. to react to wind caused by a player's attack) [...]
Animated tiles are always nice, but to be honest, i'm still sort of struggling to find a decent implementation of those, both on the editor side (how to define an animation loop of tiles), on the data side (how to store the animation loop tile ids and possibly times) and also on the rendering side (change uvs in the shader? rebuild geometry every animation frame?). but that's beside the point.
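For the data side, one possible layout (just a sketch, not how any particular engine does it) is a list of (tile id, duration) pairs per animated tile, plus a lookup that maps the current time to the frame's tile id. `TileAnimation`, `animations` and `resolve_tile` are hypothetical names, and durations are assumed to be positive milliseconds.

```python
from bisect import bisect_right
from itertools import accumulate


class TileAnimation:
    def __init__(self, frames):
        # frames: list of (tile_id, duration_ms), played in order and looped.
        self.tile_ids = [tile_id for tile_id, _ in frames]
        # Cumulative end time of each frame, e.g. [100, 200, 400].
        self.ends = list(accumulate(duration for _, duration in frames))
        self.total = self.ends[-1]

    def tile_at(self, time_ms):
        # Wrap into one loop, then find the first frame that ends after t.
        t = time_ms % self.total
        return self.tile_ids[bisect_right(self.ends, t)]


# Map data keeps storing the base tile id; animated ids are looked up here.
animations = {
    10: TileAnimation([(10, 100), (11, 100), (12, 200)]),
}


def resolve_tile(tile_id, time_ms):
    """Return the tile id to draw right now; static tiles pass through."""
    anim = animations.get(tile_id)
    return anim.tile_at(time_ms) if anim else tile_id
```

With a lookup like this the map itself never changes. The renderer only needs to know which tile id a cell shows this frame, whether it then patches the uvs of the affected quads or hands a frame offset to the shader.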
What i was thinking about would be something like a patch of grass swaying to the side when you walk through it. some high grass strands (foliage particles as done in 3d engines) so the character sinks into the grass instead of standing on top of it, with their feet occluded. wind that blows in different directions instead of a static wind loop on tiles. cutting high grass with a sword would cut down patches instead of replacing a tile with short grass. things like that come to mind to build a more dynamic world while remaining intrinsically pixel based.
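As a very rough sketch of the swaying part: each grass strand could get a horizontal offset for its tip, made up of a wind term plus a push away from the player, rounded to whole pixels so it stays on the pixel grid. `blade_offset` and all of its parameters (radius, strengths, the 0.1 phase factor) are assumptions picked for illustration, not tuned values.

```python
import math


def blade_offset(blade_x, player_x, time_s,
                 radius=12.0, max_push=3.0,
                 wind_amp=1.0, wind_speed=2.0):
    """Horizontal tip offset of one grass strand, in whole pixels."""
    # Wind: a sine wave travelling across the field, phase depends on x.
    wind = wind_amp * math.sin(time_s * wind_speed + blade_x * 0.1)

    # Push: strands near the player bend away, fading out linearly.
    dx = blade_x - player_x
    push = 0.0
    if abs(dx) < radius:
        falloff = 1.0 - abs(dx) / radius
        direction = 1.0 if dx >= 0 else -1.0
        push = direction * max_push * falloff

    # Snap to the pixel grid so the motion still reads as pixel art.
    return round(wind + push)
```

Changing the wind direction would then just mean changing the sign or phase of the wind term per region, instead of baking a loop into the tiles.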
One of the few advantages PixelArt can still claim, is how well it works along tiles, and how extremely fast and reliable you can construct worlds with that [...]
Yes, definitely. and i'm not trying to abolish tiles entirely. It is a solid foundation for floors, geometry, architecture, environments in general. I'm just considering another layer on top of that, something that might push pixel games from having mostly rigid and stiff worlds to something that reacts, something that seems to exist, as opposed to something you simply walk on. (lighting is another area that helps with that. if you want we can add this topic to the discussion too, since light often is not (pixel)grid-based.)
Owlboy for example is entirely done in this way
Ohhh interesting. I'll take a look. From what i see right now it mostly takes asset presets and patches them together. Interesting attempt. Definitely gonna study it some more!
Sorry if i wasn't quite clear on what was on my mind. It seems you got the impression i wanted to replace tile grids entirely, which was not my intention. I just wanna discuss if we still need to stick entirely to grids as we have for the past 30 years, or if we can do some things to drive pixelart forwards, the way 3d was pushed throughout the years, especially through its technology.
Just think back on it.
We started with colored polygons, then had uvs, later adding vertex lighting, pixel based lighting, bump mapping - being replaced by normal mapping soon after, shadow casting, subsurface scattering, skeletal animations, morph targets for facial animations, physics simulations for cloth and hair, ragdolls. It's just getting increasingly ridiculous how much stuff is discovered. All the while pixel games today are pretty much the same games that could have been done in the 90s.
Don't get me wrong, there's nothing wrong with that either. I'm just thinking that we could and maybe should also consider that we have more capabilities with today's machines, to leverage what we have and try to think of something we can make of it.