Some comments;
I don’t think embedding a file name in the data is a good idea: it won’t survive file renames, or the 8.3 truncation that happens on MSX.
$6 uint16 File flags
Bit 0: 1=.VGM2 file uses external Memory Data Library
file (see Section 3). The VGM2 file's Memory
Data section only contains the name of the
Memory Data Library file.
I still think it would be better not to have this feature at all, at least not in the first version (it can be added later if needed). It just adds complexity and awkward user interaction (like a manual “select data lib” prompt) for a doubtful gain.
Additionally, I think this flag should not be in the header, but in the memory data section.
I think the chip info section should come before the memory data, so that:
1. I can display chip info along with the metadata without first having to load and decompress the potentially huge data section.
2. I can process the data section with knowledge of what chips it is used for.
$10 uint32 Size of Metadata section
$14 uint32 Size of Memory Data section
$18 uint32 Size of Chip Info section
$1C uint32 Size of Track Data section
I think it would be good if the header contained a header length value, for future extension. Otherwise, I can foresee the metadata being abused for non-metadata purposes, because it is the only place where global extra information can be added compatibly.
$20 Total VGM header size
Why not simply a sequence of key-length-value records? Key: byte, length: word or doubleword, value: UTF-8 string. It would be easier to parse, since the parser would only need a byte comparison rather than a string comparison. It also meshes better with how chips are specified. There is no particular value in Vorbis reuse here imo.
The metadata contains a number of "Vorbis comments" [2] attributes.
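To make the key-length-value suggestion above concrete, here is a minimal sketch of what parsing could look like. It assumes a 1-byte key, a 16-bit little-endian length and a UTF-8 value; the byte order, the key numbers (KEY_TITLE, KEY_ARTIST) and the function names are all hypothetical and not part of any spec.

```python
import struct

# Hypothetical key assignments, for illustration only.
KEY_TITLE = 0x01
KEY_ARTIST = 0x02


def encode_klv(key, text):
    """Encode one record: 1-byte key, 16-bit little-endian length, UTF-8 value."""
    raw = text.encode("utf-8")
    return struct.pack("<BH", key, len(raw)) + raw


def parse_klv(data):
    """Parse a run of key-length-value records into a {key: text} dict."""
    fields = {}
    pos = 0
    while pos < len(data):
        if pos + 3 > len(data):
            raise ValueError("truncated record header")
        key, length = struct.unpack_from("<BH", data, pos)
        pos += 3
        if pos + length > len(data):
            raise ValueError("truncated record value")
        fields[key] = data[pos:pos + length].decode("utf-8")
        pos += length
    return fields


blob = encode_klv(KEY_TITLE, "Nemesis 2") + encode_klv(KEY_ARTIST, "Konami")
fields = parse_klv(blob)
```

Looking up a field is then a single integer comparison on the key, and a player can skip keys it does not know by advancing past the length.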
I like that the data is all guaranteed to be up-front: if I’m doing on-the-fly decompression, it’s good to be assured that there is not going to be any huge data block in the middle of the stream that would add a pause in the playback.
The precise meaning of the bytes in the Memory Data section therefore depends entirely on the memory write and stream control commands in the Track Data.
However, as I mentioned earlier, I would prefer this data to be less opaque and more structured. I understand that it complicates the spec, but it would open up many possibilities for me to preload and convert the data without having to do a pre-processing pass on the track data. Processing it in real-time would add pauses to the playback, and would effectively prevent me from using e.g. the OPL4 to play PCM data. The way this worked in VGM 1.0 was more suitable for my purposes, and I would rather that be expanded than simplified. I’ll write a little proposal later.
Ok, now I think it’s a bit weird that the header specifies the clock in numerator / denominator format and the chips do not. I would say pick one or the other. I’ve had my say about which I think is simpler, but really, both are fine for me.
$2 uint32 Chip clock
I wouldn’t specify these explicitly: some implementations could interpret this as meaning that the length will always be 1, 2 or 4, while I think a length of 3, or of 5 or more, would be equally valid.
needs, can be uint8, uint16 or uint32, as denoted by
A low-pass filter has a purpose, since one is usually present in the sound circuitry. However, I’ve never heard of a high-pass filter in the signal path?
$11 uint8 order, Global low-pass filter of nth order with specified
uint32 cutoff -3 dB cutoff frequency in Hz. Default is no filter.
$12 uint8 order, Global high-pass filter of nth order with specified
uint32 cutoff -3 dB cutoff frequency in Hz. Default is no filter.
Btw, on e.g. the OPLL in MSX there is a chain of first-order low pass filters, each at different cut-off frequencies (also varying per machine actually), so this would be an approximation. I think it’s good enough though.
I also note that you have per-channel versions of these commands as well. Maybe have one version, and use a channel no. of FFH to indicate global? That would maybe reduce the amount of duplication, esp. if more of these get added.
Typical clock is 3579545 for both SCC and SCC+ btw.
0 K005289 (Bubble System) 3579545 ???
1 K051649 (SCC) 1500000 aa dd
2 K052539 (SCC+) ??? aa dd
MoonSound is just the name given by Sunrise to their OPL4 cartridge, and Yamaha’s name OPL4 is equally well known in the MSX community, so I don’t think the former is worth mentioning explicitly.
2 Yamaha YMF278B (OPL4, MoonSound) 33868800 pp aa dd
Would it maybe be worth defining 2-byte commands for all the OPN* chips, which write to port 0? Could save a bunch of bytes.
$44 Yamaha OPN family, subtypes:
0 YM2203 (OPN) 3000000 aa dd
1 YM2608 (OPNA) 8000000 pp aa dd
I assume this means that if 4 bytes are passed in, they are interpreted as a float? Could maybe phrase that case a bit more explicitly.
Can take 8-bit, 16-bit, 24-bit, or 32-bit float samples as data.
Why are partial remaps not allowed?
$10 (varies) Remap chip channels, used for files logged from sound
drivers that allocate channels dynamically.
First data byte: chip number (1-15). Further bytes:
remapped channel numbers. All channels of the chip must
be specified; no "partial remaps" are allowed.
Alternate option: $10 chipid aa bb Redirect commands for channel A to channel B and vice versa.
With this one would specify a series of swap commands rather than one big map. Advantage would be that the length is no longer variable, and processing is a bit easier (each command just swaps two pointers, rather than more complex processing of an inner data structure).
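A rough sketch of how a player might handle such swap commands, to show that the processing stays small. The class and method names are hypothetical, and it assumes one map per chip and that aa and bb index the channel numbers as logged in the file.

```python
class ChannelMap:
    """Per-chip routing table from logged channel number to output channel."""

    def __init__(self, channel_count):
        # Identity mapping until the first swap command arrives.
        self.route = list(range(channel_count))

    def swap(self, a, b):
        """Handle one '$10 chipid aa bb' command: exchange two routes."""
        self.route[a], self.route[b] = self.route[b], self.route[a]

    def resolve(self, logged_channel):
        """Return the channel that commands for logged_channel should go to."""
        return self.route[logged_channel]


channels = ChannelMap(4)
channels.swap(0, 1)
channels.swap(1, 2)
routes = [channels.resolve(ch) for ch in range(4)]
```

After these two swaps, channels 0, 1 and 2 are rotated and channel 3 is untouched, so a full remap can still be expressed as a short series of fixed-length commands.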
I think eventually this should get a more precise specification of what it means for every chip: how the channel numbers are assigned (e.g. for those with FM + PCM) and which registers are redirected how. For now we can leave it be, since that would have to come from practical implementation experience. It could be an appendix (possibly a separate document) published later, or it could be specified in reference source code form (meh).
I don’t think the short forms are needed. I don’t really see how they would be useful, and they just complicate the implementation to save a few bytes on (hopefully infrequently used) commands. Also, if “from” and “to” retain their previous values, is that their value with the length added, or not?
$21 uint8 chip, Short forms of command $21; distinguished by the "Number
uint32 to, of data bytes" field. "from" (and "to" in the shortest
uint32 len/ form) retain the values they had after the previous $21
uint8 chip, command was executed. The short forms may appear without
uint32 len a preceding long-form command; "from" and "to" are
initialized to zero when playback starts from the
beginning of the file. "from" and "to" are maintained
per-chip, not per-track nor per-file.
Meh… Just extra playback overhead for me, while the files are already compressed by gzip…
"cctt": compression type word, decompressed by the VGM2
player, followed by compression-type-specific data bytes
("varies").
Why have this command? The ticks version ($32) is better. This one has bad precision in the low Hz range, and just requires me to do an unnecessary 16/32-bit division to determine the period that $32 would provide me directly.
$31 uint8 chip, Set playback rate in Hz for stream playback on chip
uint16/32 rate "chip". The command accepts either 3 or 5 data bytes,