Hi!
I've been thinking a lot about the failure of the video/audio tags, or rather the failure to agree on uniform video/audio technologies and codecs.
IMHO the problem is that HTML 5 is too large a standard. The nice thing about XHTML was that there were many small standards around XHTML 2. Each was a kind of optional, pluggable metalanguage. Having many smaller standards gave some flexibility here. The wrong way is to make HTML 5 a "theory of everything". I think in some cases HTML 5 is a martyr to being a "theory of everything". It hurts because the standard is too big.
In the case of video/audio, I think this is a good moment to think about a new, small but extremely flexible standard. Call it MediaML. The goal of this standard would be a small, extensible language which could include many different existing and emerging internet media technologies. This could be anything, from film to Silverlight or Flash... It doesn't matter: the standard would be a kind of abstraction layer for media on the internet.
Putting everything into one bag called "HTML 5" is painful. There are too many parties (especially companies) with too many different goals in the "internet media improvements" sector to include this in HTML 5. IMHO there are some "weak" sectors on the internet, and every standard that wants to stay flexible and develop in a modern way should avoid these sectors.
In this respect XHTML was unmatched. XHTML is small and prefers to use external metalanguages rather than take on new functionality. The goal of XHTML was simple, and that made XHTML a "master of this one thing". There are too many "masters" in internet media technologies to try to promote yet another one.
Move the technologies that are difficult to agree on into different, more flexible standards. Make HTML 5 as small as possible. This is the most productive path of development :]