VGM 2.0 suggestions / ideas

orig. title: Multiple AY chips with stereo

Post by NewRisingSun »

ValleyBell wrote:Instead of encoding the "chip number" into the chip command, we can just define the same chip multiple times.
I'm not sure I understand. If you have two SAA1099 chips (chips 0, 1), one YM3812 (chip 2) and one DAC (chip 3), surely you need to encode the chip number (0-3 in this case) in the command byte?
ValleyBell wrote:Also, I'd like to have at least 3 different delay commands: "wait ll clock", "wait mmll clock" and "wait hhmmll clock".
Ok. Keep the "custom interval"?
ValleyBell wrote:I think that one "master" clock rate for the whole file is enough.
I had this idea about super-optimizing files by using one track per channel and one-byte wait commands thanks to multiple time bases, but the additional complexity certainly isn't worth it, so forget about that. One time base per file.
ValleyBell wrote:A value that stores the size of the whole file would be useful. Especially if you're dealing with gzip compressed data. (i.e. something similar to the current VGM EOF offset)
Well, you can get that by just adding [$10]+[$14]+[$18]+[$1C]+0x20 together after gzreading 32 bytes, so that field would be kind of redundant.
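A minimal sketch of that calculation, assuming the four 32-bit fields at $10, $14, $18 and $1C are little-endian section sizes and that 0x20 is the fixed header length, as the sentence above implies. The function names are hypothetical; the draft specification is what actually defines the fields.

```c
#include <stdint.h>
#include <stddef.h>

/* Read a 32-bit little-endian value (assumed byte order for header fields). */
static uint32_t read_le32(const uint8_t *p) {
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: total uncompressed size from the first 32 header bytes.
 * Returns 0 if fewer than 32 bytes are available. */
uint64_t vgm2_total_size(const uint8_t *hdr, size_t avail) {
	if (avail < 0x20) return 0;
	uint64_t total = 0x20;                 /* fixed header length */
	for (size_t off = 0x10; off <= 0x1C; off += 4)
		total += read_le32(hdr + off);     /* one size field per section */
	return total;
}
```

The sum is kept in 64 bits so that four large section sizes cannot overflow it.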
ValleyBell wrote:Virtual Boy VSU and WonderSwan are WSG-type sound chips
Oh, there are probably other errors in the chip table as well at this time. :P
ctr wrote:You will still need datablocks, as chips using RAM are still able to rewrite the contents during playback.
You could command a RAM write via "write xxx bytes from the ROM/PCM section, starting at offset xxx, to RAM starting at offset yyy". (I really don't like huge chunks of data in the command stream.)
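Such a command could be handled roughly as follows. This is only a sketch: the function name, the parameter order and the bounds-checking policy are assumptions, not part of any proposal in this thread.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler for "write len bytes from the ROM/PCM section,
 * starting at src_off, to chip RAM starting at dst_off".
 * Returns 0 on success, -1 if either range is out of bounds. */
int vgm2_ram_write(uint8_t *ram, size_t ram_size,
                   const uint8_t *rom, size_t rom_size,
                   size_t src_off, size_t dst_off, size_t len) {
	/* compare against the space left after the offset to avoid overflow */
	if (src_off > rom_size || len > rom_size - src_off) return -1;
	if (dst_off > ram_size || len > ram_size - dst_off) return -1;
	memcpy(ram + dst_off, rom + src_off, len);
	return 0;
}
```

The command stream then only carries two offsets and a length, while the payload stays in the ROM/PCM section.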
ctr wrote:A feature that I think will be essential to VGM 2 is the ability to set panning and volume for each channel (output) of a sound chip separately.
Ok, I'll just add another "chip attribute" for that.
Last edited by NewRisingSun on 2017-09-18, 9:25:42, edited 6 times in total.

Post by vampirefrog »

NewRisingSun wrote:Attached. Consider everything a mere suggestion, including the chip types not supported in VGM v1.71. The Notes and Open Questions sections would not be in any final document, of course. I have not yet bothered to think about the DAC stream control stuff, since I have rarely used it myself so far.
You should go ahead and implement a parser for this new format, and a converter from the current format to the new format. Then you can use libvgm to play the new files. You may also benefit from the data in this sheet.

Also, here is some code that might help, it is generated from the above sheet:

Code:

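#include <stdint.h>
#include <stdio.h>  // printf in the usage example

// get the size of a VGM command, or -1 if it runs out of bytes
int vgm_cmd_size(uint8_t *bytes, int remaining_bytes) {
	if(bytes[0] == 0x67) {
		// data block: 0x67 0x66 tt ss ss ss ss (data), 7 header bytes, 32-bit little-endian size
		if(remaining_bytes >= 7) {
			uint32_t size = (uint32_t)bytes[3] | ((uint32_t)bytes[4] << 8) | ((uint32_t)bytes[5] << 16) | ((uint32_t)bytes[6] << 24);
			return (size > (uint32_t)(remaining_bytes - 7)) ? -1 : (int)(7 + size);
		} else return -1;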
	} else if((bytes[0] >= 0x62 && bytes[0] <= 0x63) || (bytes[0] >= 0x70 && bytes[0] <= 0x8f)) return 1;
	else if((bytes[0] >= 0x30 && bytes[0] <= 0x3f) || (bytes[0] >= 0x4f && bytes[0] <= 0x50)) return 2;
	else if((bytes[0] >= 0x40 && bytes[0] <= 0x4e) || (bytes[0] >= 0x51 && bytes[0] <= 0x5f) || (bytes[0] >= 0xa0 && bytes[0] <= 0xbf)) return 3;
	else if(bytes[0] >= 0xc0 && bytes[0] <= 0xdf) return 4;
	else if((bytes[0] >= 0x90 && bytes[0] <= 0x91) || (bytes[0] >= 0xe0 && bytes[0] <= 0xff)) return 5;
	else switch(bytes[0]) {
		case 0x66: return 1;
		case 0x94: return 2;
		case 0x61: return 3;
		case 0x95: return 5;
		case 0x92: return 6;
		case 0x93: return 11;
		case 0x68: return 12;
	}

	return -1;
}

// gets the number of bytes until the Nth command (N=skip)
// if there is an error (it runs out of data), it returns one's complement of the number of bytes that it managed to skip until the error
int vgm_skip_cmds(uint8_t *bytes, int remaining_bytes, int skip) {
	if(remaining_bytes == 0) return -1;
	if(skip == 0) return 0;

	int cur = 0;
	uint8_t *p = bytes;
	while(cur < remaining_bytes) {
		int len = vgm_cmd_size(p, remaining_bytes - cur);
		if(len < 0) return ~cur;
		cur += len;
		p = bytes + cur;
		skip--;
		if(skip == 0) return cur;
	}

	return ~cur;
}

// get the number of samples to wait from a command
// returns -1 if it runs out of data
int vgm_cmd_get_wait(uint8_t *bytes, int remaining_bytes) {
	if(bytes[0] >= 0x70 && bytes[0] <= 0x7f) return bytes[0] - 0x6f;
	else if(bytes[0] >= 0x80 && bytes[0] <= 0x8f) return bytes[0] - 0x80;
	else switch(bytes[0]) {
		case 0x61:
			if(remaining_bytes >= 3) {
				return bytes[1] | (bytes[2] << 8); // 16-bit little-endian sample count
			} else return -1;
			break;
		case 0x62:
			return 735;
		case 0x63:
			return 882;
	}
	return 0;
}

// gets the number of bytes to skip after counting N samples (N=skip)
// fills *left with the remaining number of samples
// returns one's complement if it runs out of data
int vgm_skip_samples(uint8_t *bytes, int remaining_bytes, int skip, int *left) {
	if(remaining_bytes == 0) return -1;

	int cur = 0, totalSamples = 0;
	uint8_t *p = bytes;
	while(cur < remaining_bytes) {
		int len = vgm_cmd_size(p, remaining_bytes - cur);
		// one's complement, as documented above; -cur could not signal an error at offset 0
		if(len < 0) return ~cur;
		int samples = vgm_cmd_get_wait(p, remaining_bytes - cur);
		totalSamples += samples;
		if(totalSamples >= skip) {
			if(left) *left = totalSamples - skip;
			return cur;
		}
		cur += len;
		p = bytes + cur;
	}
	return ~cur;
}

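// check if a command is valid
// (vgm_cmd_bytes() is not defined in this listing, so vgm_cmd_size() is used instead)
int vgm_cmd_valid(uint8_t c) {
	// a data block (0x67) is valid, but vgm_cmd_size() cannot size it from a single byte
	return c == 0x67 || vgm_cmd_size(&c, 1) >= 0;
}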

// get the chip ID for a command
// does not check data blocks
int vgm_cmd_get_chip(uint8_t *bytes, int length) {
	if((bytes[0] >= 0x80 && bytes[0] <= 0x8F)) return VGM_CHIP_YM2612;
	else switch(bytes[0]) {
		case 0x4F:
		case 0x3F:
		case 0x50:
		case 0x30:
			return VGM_CHIP_SN76489;
		case 0x51:
		case 0xA1:
			return VGM_CHIP_YM2413;
		case 0x52:
		case 0xA2:
		case 0x53:
		case 0xA3:
		case 0xE0:
			return VGM_CHIP_YM2612;
		case 0x54:
		case 0xA4:
			return VGM_CHIP_YM2151;
		case 0x55:
		case 0xA5:
			return VGM_CHIP_YM2203;
		case 0x56:
		case 0xA6:
		case 0x57:
		case 0xA7:
			return VGM_CHIP_YM2608;
		case 0x58:
		case 0xA8:
		case 0x59:
		case 0xA9:
			return VGM_CHIP_YM2610;
		case 0x5A:
		case 0xAA:
			return VGM_CHIP_YM3812;
		case 0x5B:
		case 0xAB:
			return VGM_CHIP_YM3526;
		case 0x5C:
		case 0xAC:
			return VGM_CHIP_Y8950;
		case 0x5D:
		case 0xAD:
			return VGM_CHIP_YMZ280B;
		case 0x5E:
		case 0xAE:
		case 0x5F:
		case 0xAF:
			return VGM_CHIP_YMF262;
		case 0xA0: return VGM_CHIP_AY8910;
		case 0xB0:
		case 0xC1:
			return VGM_CHIP_RF5C68;
		case 0xB1:
		case 0xC2:
			return VGM_CHIP_RF5C164;
		case 0xB2: return VGM_CHIP_PWM;
		case 0xB3: return VGM_CHIP_GAMEBOY_DMG;
		case 0xB4: return VGM_CHIP_NES_APU;
		case 0xB5:
		case 0xC3:
			return VGM_CHIP_MULTIPCM;
		case 0xB6: return VGM_CHIP_UPD7759;
		case 0xB7: return VGM_CHIP_OKIM6258;
		case 0xB8: return VGM_CHIP_OKIM6295;
		case 0xB9: return VGM_CHIP_HUC6280;
		case 0xBA: return VGM_CHIP_K053260;
		case 0xBB: return VGM_CHIP_POKEY;
		case 0xBC:
		case 0xC6:
			return VGM_CHIP_WONDERSWAN;
		case 0xBD: return VGM_CHIP_SAA1099;
		case 0xBE:
		case 0xD6:
			return VGM_CHIP_ES5506;
		case 0xBF: return VGM_CHIP_GA20;
		case 0xC0: return VGM_CHIP_SEGA_PCM;
		case 0xC4: return VGM_CHIP_QSOUND;
		case 0xC5: return VGM_CHIP_SCSP;
		case 0xC7: return VGM_CHIP_VSU;
		case 0xC8: return VGM_CHIP_X1_010;
		case 0xD0: return VGM_CHIP_YMF278B;
		case 0xD1: return VGM_CHIP_YMF271;
		case 0xD2: return VGM_CHIP_K051649;
		case 0xD3: return VGM_CHIP_K054539;
		case 0xD4: return VGM_CHIP_C140;
		case 0xD5: return VGM_CHIP_ES5503;
		case 0xE1: return VGM_CHIP_C352;
	}
	return VGM_CHIP_NONE;
}
Example use:

Code:

int main(int argc, char **argv) {
	uint8_t bytes[] = {
		0x61, 0x03, 0x04,
		0x52, 0x01, 0x02,
		0x53, 0x03, 0x04,

		0x52, 0x05, 0x06,
		0x53, 0x07, 0x08,
	};

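	// byte offset reached after skipping the five commands above
	int s = vgm_skip_cmds(bytes, sizeof(bytes), 5);
	printf("%d\n", s);
	// wait length, in samples, of the first command (0x61 nn nn)
	printf("%d\n", vgm_cmd_get_wait(bytes, sizeof(bytes)));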

	return 0;
}

Post by ValleyBell »

NewRisingSun wrote:
ValleyBell wrote:Instead of encoding the "chip number" into the chip command, we can just define the same chip multiple times.
I'm not sure I understand. If you have two SAA1099 chips (chips 0, 1), one YM3812 (chip 2) and one DAC (chip 3), surely you need to encode the chip number (0-3 in this case) in the command byte?
Never mind. I misunderstood how it was supposed to work.
NewRisingSun wrote:
ValleyBell wrote:Also, I'd like to have at least 3 different delay commands: "wait ll clock", "wait mmll clock" and "wait hhmmll clock".
Ok. Keep the "custom interval"?
Feel free to add/keep additional stuff.

Post by NewRisingSun »

Here's a slightly updated version, incorporating the previous comments. I still need to write up the ROM/PCM data section and the DAC stream control stuff. What do you guys think about GD3 versus Vorbis comments?
Attachments
vgmspec200-2017-09-18-NRS.txt
(17.55 KiB) Downloaded 339 times

Post by grauw »

Interesting, I’ll comment on it from VGM player for MSX perspective soon :).

However at least one quick comment:
1 K051649 (SCC1) 1500000/1 pp aa dd
Please make this just “aa dd”, the current encoding is very yuck. The SCC and SCC+ simply have 256 registers (with some slight differences between the two).

Post by grauw »

Ok, exchanging some sleep for some comments… :)

Firstly, there is a bunch of things I like, such as the new track command with the chip index in it, I won’t touch on these too much. Also, please take the below as constructive criticism, because that is how it is intended :D.

1. Please put GD3 data at the start, currently I have to load and decompress the entire file before I can display the song data, rather than just having a quick load when the user selects a file or views the directory index or playlist. Also I would like meta-information like total length in front.

2. Please put the ROM/PCM block before the track data. I don’t want to decompress the entire VGM first before starting playback. Background (on the fly) decompression is high on my wishlist for VGMPlay-MSX, this will drastically reduce load times and memory requirement.

(So this order: metadata, chips, rom data, tracks.)

3. Multiple tracks: are they supposed to play back one by one or simultaneously? In the former case, why not use separate VGMs? In the latter case, it will be difficult to combine with background loading, though that could be OK if it’s not used often. What is the exact purpose? If there is any way the tracks could be interleaved, that would have my preference.

4. Specifying the frequency as numerator / denominator is nice in theory, however I think it unnecessarily complicates things. By their nature all chip clocks have deviations (even over time), and thus none will hit the exact number. The only practical application I can imagine is to encode low frequencies like 59.94 Hz.

5. As mentioned before please encode SCC commands as “aa dd”.

6. Why are wavetable chip command addresses not encoded in intel byte order (ll mm)?

7. For me the point for a type / subtype would be that if a new subtype is added, there is a likelihood that it will be recognised by older players which have not added support for it and will still play back at least to some degree.

7a. In that vein, it is strange to me that within a type there are different command lengths.

7b. Also the OPLL does not belong in the OPL group since it is not register compatible with the others.

8. I feel for parsing simplicity, rather than FF it would be better to use $7F for special commands, then the 16th chip can be used with data length 8. Not a big fan of the MIDI style variable length, a length byte or encoded length byte would be easier to process, and not restrict the following values to 7 bits and the (slow on Z80) bit-shifting encoding schemes for >127 values that will undoubtedly follow.

9. Additionally, maybe leave a few numbers free for future extension? $70-7A? Could define some fixed lengths for some / all of them for downwards-compatibility.

9a. Instead of 7C-7F and the “special command”, maybe introduce a special “VGM control chip” and reuse the chip command? Seems like it would add a lot of flexibility. 00-7F could be all waits and 80-FF all commands.

9b. Channel remaps could be specified as channel swaps to fit within the 8 data bytes and reduce the amount of state.

9c. Another idea: the wait commands could use a UTF-8-like encoding where the two most significant bits indicate the length (0 = 1 byte, 1 = 2 bytes, 2 = 3 bytes, 3 = 4 bytes; big-endian order).

10. Currently my timing code is 16-bit, changing to 32-bit and a much higher resolution would be a bit more annoying to process, not a huge fan but I guess I’ll manage.

11. Please put the command data byte count in bits 0-2 for easy parsing by masking.

12. Channel remap will be very annoying to implement, and not all chips even define a channel that clearly, e.g. think of the AY3 tone channels sharing the single noise generator. Also not all channels are equal, e.g. consider the YM2612 6th channel with DAC, or the YM2151’s 8th channel with drums. I think it would be way better if this could somehow be handled on the VGM encoder’s side (no concrete suggestions right now, maybe something with chips / tracks?). Or at least fail relatively gracefully (e.g. as much as it does now) if it is not implemented.

13. For future extensibility (upwards compatibility), I think it might be good to use a chunks-like architecture of sorts. Currently I think if a chip attribute is ever added, the file can’t be parsed by older players. E.g. in the chips table, prefixing each chip by a length byte would be a good start.

14. Combine the chip type and subtype into one 16-bit value.

15. A common structure for all chip types would be easier to parse than the current ID-value based approach. At least the current attributes seem common to all.

16. Stream commands are just another generic DMA chip as far as I’m concerned. Could be the same as the earlier mentioned VGM control chip.

17. It would be good if there was support for multiple data blocks, because some I would upload to a chip’s sample memory and could discard after, while others I would want to keep in memory (e.g. CPU-controlled DAC PCM data), but as it is proposed I think I can’t know this in advance, whereas I can with VGM 1.0.

18. Alternatively / additionally it would be really handy for me if there was some index of start / end / loop positions of the blocks that are used by stream commands and by PCM chips which do not have these as preset instrument data in the ROM. Maybe this could replace the multiple data blocks thing. Saves me from having to pre-scan the entire VGM trying to determine these when emulating PCM on OPL4. It could also allow shortening special command $03 and the stream commands…

19. Finally, although I do see the flaws in the current VGM format, I already do not have that much free memory in VGMPlay-MSX, so it’ll be a bit of a challenge to put the code for two formats in one executable. In more general terms, there’s currently quite a huge library of VGM players and hardware, all of which will not support the new format, and may not for the coming years or at all. The benefit needs to be great enough.
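To make the wait encoding from point 9c concrete, here is a rough sketch of one way to read it: the two most significant bits of the first byte hold the byte count minus one, and the remaining bits form a big-endian value. The function names are made up, and the exact bit layout is an assumption rather than anything agreed in this thread.

```c
#include <stdint.h>
#include <stddef.h>

/* Encode a wait of up to 30 bits.
 * Returns the number of bytes written (1-4), or 0 if the value is too large. */
size_t wait_encode(uint32_t ticks, uint8_t out[4]) {
	size_t n;
	if (ticks < (1u << 6)) n = 1;
	else if (ticks < (1u << 14)) n = 2;
	else if (ticks < (1u << 22)) n = 3;
	else if (ticks < (1u << 30)) n = 4;
	else return 0;
	/* store the value big-endian in n bytes */
	for (size_t i = 0; i < n; i++)
		out[i] = (uint8_t)(ticks >> (8 * (n - 1 - i)));
	out[0] |= (uint8_t)((n - 1) << 6);   /* length tag in the top two bits */
	return n;
}

/* Decode a wait. Returns bytes consumed, or 0 if the buffer is too short. */
size_t wait_decode(const uint8_t *in, size_t avail, uint32_t *ticks) {
	if (avail == 0) return 0;
	size_t n = (size_t)(in[0] >> 6) + 1; /* length is known from the first byte */
	if (avail < n) return 0;
	uint32_t v = in[0] & 0x3F;
	for (size_t i = 1; i < n; i++)
		v = (v << 8) | in[i];
	*ticks = v;
	return n;
}
```

Under this reading a one-byte wait covers 0-63 ticks and a four-byte wait covers up to 2^30 - 1 ticks; the decoder needs only a mask and whole-byte shifts.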

Post by NewRisingSun »

Couple of early answers, more maybe to come later...
grauw wrote:Please put GD3 data at the start
That's been an issue with MP3s between ID3v1 and ID3v2. At this point, we haven't even discussed whether to keep GD3, or switch to Vorbis comments, which I would prefer.
grauw wrote:Please put the ROM/PCM block before the track data
Can be done. It will not make a difference for players that decompress the entire file anyway.
grauw wrote:Multiple tracks, are they supposed to play back one by one or simultaneously? In the former case, why not have different VGMs then.
Simultaneously, for files with channel-specific loop points. Having to play different VGMs simultaneously would mean that one song is no longer represented by one file.
grauw wrote:Specifying the frequency as numerator / denominator is nice in theory, however I think it unnecessarily complicates things.
It should be pretty simple to derive integer "counter add" and "tick period" values given a player's master timer clock and a file's numerator/denominator values. Maybe I'll add some code on that issue. It will require 32-bit division though in any case, but I would not know how to avoid that.
grauw wrote:As mentioned before please encode SCC commands as "aa dd"
Can do, after I research why it's currently listed as pp aa dd.
grauw wrote:Why are wavetable chip command addresses not encoded in intel byte order (ll mm)?
I've just copied it from the current specification.
grauw wrote:For me the point for a type / subtype would be that if a new subtype is added, there is a likelihood that it will be recognised by older players which have not added support for it and will still play back at least to some degree. 7a. In that vein, it is strange to me that within a type there are different command lengths. 7b. Also the OPLL does not belong in the OPL group since it is not register compatible with the others.
Well, my criterion is more about common emulation cores than compatibility with players which don't know a particular subtype.
grauw wrote:I feel for parsing simplicity, rather than FF it would be better to use $7F for special commands, then the 16th chip can be used with data length 8.
Oh, ok.
grauw wrote: Not a big fan of the MIDI style variable length, a length byte or encoded length byte would be easier to process, and not restrict the following values to 7 bits and the (slow on Z80) bit-shifting encoding schemes for >127 values that will undoubtedly follow.
If I can avoid putting data blocks into the code, then the length byte is unlikely to ever get beyond 127 bytes anyway.
grauw wrote: Additionally, maybe leave a few numbers free for future extension? $70-7A? Could define some fixed lengths for some / all of them for downwards-compatibility.
I think I'll just drop the fixed-length bytes above 16/17.
grauw wrote:Instead of 7C-7F and the “special command”, maybe introduce a special “VGM control chip” and reuse the chip command? Seems like it would add a lot of flexibility. 00-7F could be all waits and 80-FF all commands.
That would reduce the number of usable chips by one, though. Then again, who uses even 15 chips?
grauw wrote:Channel remaps could be specified as channel swaps to fit within the 8 data bytes and reduce the amount of state.
Mmmh. I'm not sure I would want to give up the flexibility of variable length bytes beyond a length of 8 though.
grauw wrote:Another idea: the wait commands could use a UTF-8-like encoding where the two most significant bits indicate the length (0 = 1 byte, 1 = 2 bytes, 2 = 3 bytes, 3 = 4 bytes; big-endian order).
That would result in a wait command consisting of 6 data bits per byte. Not sure whether others would like that.
grauw wrote:Please put the command data byte count in bits 0-2 for easy parsing by masking.
So basically swapping the nibbles? Sure, I can do that. That would be a switch from your UTF-8-like proposal for the wait commands though.
grauw wrote:Channel remap will be very annoying to implement, and not all chips even define a channel that clearly,
I don't think it would ever be needed on the AY chip anyway. The feature is such that without remapping, the file will still play, but with artifacts.
grauw wrote:For future extensibility (upwards compatibility), I think it might be good to use a chunks-like architecture of sorts. Currently I think if a chip attribute is ever added, the file can’t be parsed by older players. E.g. in the chips table, prefixing each chip by a length byte would be a good start.
If no attribute ever has more than eight bytes, and we don't need more than 16 attribute types, I could use three bits of the attribute number as a length count similar to how the track commands work.
grauw wrote:Combine the chip type and subtype into one 16-bit value.
Maybe. :)
grauw wrote:A common structure for all chip types would be easier to parse than the current ID-value based approach. At least the current attributes seem common to all.
A file should not have to specify fields that use the "default" values in my opinion, especially if we are going to add channel-specific volumes, which of course will vary between chips in size anyway.
grauw wrote:Stream commands are just another generic DMA chip as far as I’m concerned. Could be the same as the earlier mentioned VGM control chip.
That's a good idea.
grauw wrote:It would be good if there was support for multiple data blocks,
I should make it clearer that the ROM/PCM section doesn't represent ONE data block, but ALL data blocks.

Post by ctr »

Not having the datablocks in the command stream itself might present problems for VGM logging, especially those chips that use RAM. It would make it necessary to store the entire file or the datablock section in a memory buffer instead of writing directly to the file, as the size of the datablock may change while logging.

Also, I think at some places we have to make a choice between making parsing easier on 30 year old computers vs leaving more room for expansion.

Also I suggest merging most of the chip attributes. An example:

Code:

00+a bbcc ddeeffgg hh ii - new chip
 where a = Chip id (00-0f),
 bc = chip type/subtype,
 defg = clock
 h = Channel mask (bit 0 set, use channel 0 etc).
 i = global volume (affect all channels)
80 04 aa bbcc dd = Set channel volume & panning
 04 = length
 aa = chip id,
 bc = Volume (in 8.8 fixed point)
 d = panning (signed 8 bit)
Any extra commands would have a format like this 
81 <length> <commands>
ff - End of chip attributes
As parsing command 00-0f will be required, it can be assumed that players do not need to know the length of this command. All optional commands will be at 80-fe instead and include a length byte.

I use a 32 bit unsigned int for clock. Even if we used a clock/divider setting, it would not work well because most of our logs are done by MAME. At some places we have to do things the MAME way, and MAME compiles all the final clocks with divider already applied (they're only present in the source code).

Post by NewRisingSun »

Under no circumstances will I ever accept doing things in a suboptimal way simply because MAME does them that way. Ever. Yes, I'm kind of militant about that. MAME is not that great an emulator, in terms of its architecture at least, to justify making it an authority, even as MAMEDev act as if it were (hence my contempt for them). MAME would be able to log with the rounded chip clock in the numerator and 1 in the denominator, and a VGM2 utility would replace those with the exact fraction based on a conversion table.

You are right that a VGM2 logger would have to maintain a separate memory region, but I'm not sure if they don't already do that. DOSBox-VGM certainly does (which is why I could, and should, modify it to put the complete data bank at the beginning of the file instead of mid-stream).

Replacing my uint32 volumes with 16-bit 8.8 fixed point seems ok, but I still prefer left and right volume instead of volume plus panning, because panning could, at least in theory, be applied in different ways. And I would not want to force a volume setting where it isn't needed, such as when having just a single chip, which is why I don't like the attribute-merging idea that much.

Post by grauw »

Thanks for your responses! I’ll reply later but one thing...
Under no circumstances will I ever accept doing things in a suboptimal way simply because MAME does them that way.
This is the reason why the SCC is logged as it is :D. It reflects an emulation core implementation, which implements both SCC and SCC+, and isn’t driven by VGM at the I/O level but one level deeper. By the reasoning of the current implementation, the EG and frequency values should also be separated for FM chip commands. For me, it is just extra bytes and decoding hassle (if / then / if / then / if / then, rather than the asm equivalent of *(9800h + reg) = value).

Post by ValleyBell »

NewRisingSun wrote:
grauw wrote:Why are wavetable chip command addresses not encoded in intel byte order (ll mm)?
I've just copied it from the current specification.
I'd actually prefer to encode all command addresses as Big Endian (mm ll) because that is consistent with the port/register/data encoding used for Yamaha FM chips.
ctr wrote:Not having the datablocks in the command stream itself might present problems for VGM logging, especially those chips that use RAM. It would make it necessary to store the entire file or the datablock section in a memory buffer instead of writing directly to the file, as the size of the datablock may change while logging.
NewRisingSun wrote:You are right that a VGM2 logger would have to maintain a separate memory region, but I'm not sure if they don't already do that. DOSBox-VGM certainly does (which is why I could, and should, modify it to put the complete data bank at the beginning of the file instead of mid-stream).
The problem is that you either need to know what the data blocks need to contain in advance or keep the whole VGM in memory until you stop recording, because you need to analyze and re-sort data when writing the VGM.
And especially when logging certain Arcade machines, you can get a log of 100 MB within a few minutes, which is why I'd prefer to not keep the whole thing in memory.
ctr wrote:I use a 32 bit unsigned int for clock. Even if we used a clock/divider setting, it would not work well because most of our logs are done by MAME. At some places we have to do things the MAME way, and MAME compiles all the final clocks with divider already applied (they're only present in the source code).
Additionally, all of our emulation cores use uint32 for both clock AND sample rate. So a clock/divider setting would have no benefit except for additional file format complexity.

About the chip table: I would prefer having a "number of used sound chips" counter. I'd suggest this layout, which I think is pretty flexible and easy to parse:

Code:

aa - number of chips
[repeat aa times]
  aabb ccddeeff - new chip (chip ID implied by number of previously defined chips)
   ab = chip type/subtype,
   fedc = clock
  [repeat until EOA]
    aa bb ... - set data for value aa (bb = data length)
     00 01 aa - flags (1 byte)
     00 02 aabb - flags (2 bytes)
     01 04 aabbccdd - global volume (16.16 fixed point, Little Endian)
     02 02 aabb - set panning (int16 - maybe with a range of ±16384 or so)
     03 08 aabbccdd eeffgghh - set L/R volume (two values, one for left and right speaker each, fixed 16.16 signed)
     ...
     80-FE aa ... - chip-specific stuff (e.g. SN764xx noise parameters, ES5503 output channels)
  [end]
  FF - End Of chip Attributes (EOA)
[end]
Panning and L/R volume would override each other.
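A parser for this layout could look roughly like the sketch below. The struct, the function names and the default global volume of 1.0 are assumptions made for illustration; only the byte layout is taken from the listing above. Unknown attributes are skipped by their length byte, which is what lets an older player read a table containing newer attributes.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
	uint8_t  type, subtype;   /* "aabb" */
	uint32_t clock;           /* Hz, stored little-endian ("fedc") */
	uint32_t global_volume;   /* 16.16 fixed point; 0x00010000 = 1.0 (assumed default) */
} chip_entry;

static uint32_t rd_le32(const uint8_t *p) {
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the number of bytes consumed, or -1 on truncated or oversized input. */
long parse_chip_table(const uint8_t *p, size_t avail,
                      chip_entry *chips, size_t max_chips, size_t *count) {
	size_t pos = 0;
	if (avail < 1) return -1;
	size_t n = p[pos++];                      /* number of chips */
	if (n > max_chips) return -1;
	for (size_t i = 0; i < n; i++) {
		if (avail - pos < 6) return -1;
		chips[i].type = p[pos];
		chips[i].subtype = p[pos + 1];
		chips[i].clock = rd_le32(p + pos + 2);
		chips[i].global_volume = 0x00010000u;
		pos += 6;
		for (;;) {                            /* attributes until EOA */
			if (avail - pos < 1) return -1;
			uint8_t id = p[pos++];
			if (id == 0xFF) break;            /* End Of chip Attributes */
			if (avail - pos < 1) return -1;
			size_t len = p[pos++];
			if (avail - pos < len) return -1;
			if (id == 0x01 && len == 4)
				chips[i].global_volume = rd_le32(p + pos);
			pos += len;                       /* all other attributes are skipped by length */
		}
	}
	*count = n;
	return (long)pos;
}
```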


grauw: The reason SCC is logged in this weird manner is that I just took the emulation and put VGM logging instructions there without knowing how the chip works. If I were to do SCC logging again, I'd surely not do it this way. But you can't change what happened. The SCC also seems to be one of very few chips where MAME handles writes in such a weird way. Usually the sound core does the address -> function translation by itself. I dunno why they didn't do that for the SCC.

Post by NewRisingSun »

The numerator/denominator fields for the time base were in that S98 specification, so that's why I went with that. I think it's a good way of avoiding rounding errors if one wants to avoid using floats. I think uint32 could be acceptable for chip clocks, because the error would be within the defined tolerances. For example, SMPTE 170M states a tolerance of +/- 10 Hz for the NTSC 3579545.4545... subcarrier frequency. It's a different story for the file's overall time base. If the time base is an integer such as 44100, we have no problem. But if it is supposed to contain the logged system's master, CPU or timer clock (1.79 MHz for the NES CPU, 1.19 MHz for the PC's timer chip), then a uint32 value would yield rounding errors accumulating over time. So, I would keep the quotient for the time base but move to uint32 for chip clocks.
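The drift argument can be illustrated with a small sketch that converts file ticks to output samples using the exact fraction and carries the remainder forward, so that no rounding error accumulates. The struct and function names are hypothetical. The test values in the usage note assume the NES CPU clock is half the NTSC subcarrier, i.e. 19687500/11 Hz.

```c
#include <stdint.h>

typedef struct {
	uint32_t num, den;    /* file time base = num/den ticks per second */
	uint32_t rate;        /* player output sample rate in Hz */
	uint64_t remainder;   /* carried fraction, in units of 1/num sample */
} tick_converter;

/* Convert a wait of `ticks` file ticks into whole output samples.
 * samples = ticks * rate * den / num; the division remainder is kept
 * for the next call instead of being rounded away.
 * Assumes ticks * rate * den fits in 64 bits. */
uint64_t ticks_to_samples(tick_converter *c, uint32_t ticks) {
	uint64_t acc = c->remainder + (uint64_t)ticks * c->rate * c->den;
	c->remainder = acc % c->num;
	return acc / c->num;
}
```

With num = 19687500, den = 11 and rate = 44100, feeding 19687500 ticks in any chunking yields exactly 11 seconds of output (485100 samples). A time base rounded to a whole number of Hz would make that total depend on how far the rounded value is from the true one.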

I still don't understand why have a panning attribute at all when you already have left/right volumes. Especially when one is supposed to override the other, it just creates a potential for contradictory file data.

So would you all like to drop the ROM/PCM section completely and keep the old data blocks, keep both and use data blocks during logging and move them to the ROM/PCM section by a cleanup utility, or drop the data blocks as in the current proposal, requiring arcade emulators to keep 100+ MB data in memory during logging (which I don't see as problematic given today's memory sizes)?

Post by grauw »

NewRisingSun wrote:
grauw wrote:Please put GD3 data at the start
That's been an issue with MP3s between ID3v1 and ID3v2. At this point, we haven't even discussed whether to keep GD3, or switch to Vorbis comments, which I would prefer.
I think the only reason to put them at the end is for quick retagging, something which was perhaps an issue in the early days of MP3, but today…

No opinion on GD3 vs. Vorbis atm. GD3 is not very extensible and I’d much prefer UTF-8 over UTF-16, but I haven’t read up about Vorbis.
NewRisingSun wrote:
grauw wrote:Specifying the frequency as numerator / denominator is nice in theory, however I think it unnecessarily complicates things.
It should be pretty simple to derive integer "counter add" and "tick period" values given a player's master timer clock and a file's numerator/denominator values. Maybe I'll add some code on that issue. It will require 32-bit division though in any case, but I would not know how to avoid that.
Not too worried about the computational or code size expense, it’s nothing compared to gzip decompression. And as you say I’ll need a 32-bit division routine anyway to deal with the variable tick rate. This remark was more from a simplicity POV. (Iow. no needless complication for mathematical perfection without a clear practical benefit. But you elaborated later I see.)
NewRisingSun wrote:
grauw wrote:For me the point for a type / subtype would be that if a new subtype is added, there is a likelihood that it will be recognised by older players which have not added support for it and will still play back at least to some degree. 7a. In that vein, it is strange to me that within a type there are different command lengths. 7b. Also the OPLL does not belong in the OPL group since it is not register compatible with the others.
Well, my criterion is more about common emulation cores than compatibility with players which don't know a particular subtype.
A bit of a weak criterion IMO… :D We can do better! Otherwise forget about the grouping, just have an opaque 8 or 16-bit ID.

For me what would be worth capturing is that currently AY-3-8910 and Yamaha YM2149 are two completely compatible subtypes. Ditto for SN76489 variants. Suppose a player for AY-3-8910 is released today, and two years from now you guys add, say, a YM2149C because it has a higher envelope resolution or different noise colour, and start logging a bunch of music for it. When that happens, the player should just be able to play these without requiring an update. In VGM 1.0, this was the case, since it would just be specified as a flag which the player would ignore.
NewRisingSun wrote:
grauw wrote: Not a big fan of the MIDI style variable length, a length byte or encoded length byte would be easier to process, and not restrict the following values to 7 bits and the (slow on Z80) bit-shifting encoding schemes for >127 values that will undoubtedly follow.
If I can avoid putting data blocks into the code, then the length byte is unlikely to ever get beyond 127 bytes anyway.
With 127 I was referring to the 7-bit value range (by using 8th bit as a terminator bit); in MIDI there are plenty of cases where e.g. a 32-bit value is encoded in 5 bytes which need to be bit-shift-glued together, that’s cumbersome and ugly, and I would much prefer a length byte (especially if you say it is unlikely to exceed 256 bytes).
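To make the comparison concrete, here is a sketch of both decoders. The first follows the standard MIDI variable-length quantity scheme (7 data bits per byte, top bit set on every byte except the last); the second is just one possible length-byte layout with a little-endian payload, which is my assumption and not something from the draft.

```python
def read_midi_vlq(data, pos):
    """Decode a MIDI-style variable-length quantity starting at pos."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        # Every byte contributes 7 bits that must be shifted into place.
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos

def read_length_prefixed(data, pos):
    """Decode a value stored as a length byte plus that many bytes.

    Little-endian payload is an assumption for this example.
    """
    length = data[pos]
    start = pos + 1
    payload = data[start:start + length]
    if len(payload) != length:
        raise ValueError("truncated value")
    return int.from_bytes(payload, "little"), start + length
```

The length-byte version needs no shifting per byte, and a player that does not care about the value can skip it by adding the length to its read pointer.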
NewRisingSun wrote:
grauw wrote: Additionally, maybe leave a few numbers free for future extension? $70-7A? Could define some fixed lengths for some / all of them for downwards-compatibility.
I think I'll just drop the fixed-length bytes above 16/17.
What do you mean by 16/17? Anyway it was not super important or necessary, just a bit of a safety net just in case. But if a “VGM player control” chip is implemented then there’s plenty of safety net there, it just takes one more byte (note: the length bits can be used to our advantage and make it automatically forward-compatible).
NewRisingSun wrote:
grauw wrote:Channel remap will be very annoying to implement, and not all chips even define a channel that clearly,
I don't think it would ever be needed on the AY chip anyway. The feature is such that without remapping, the file will still play, but with artifacts.
If that’s the case, and it’s not going to be too common, I think that would be an acceptable fallback.
NewRisingSun wrote:
grauw wrote:For future extensibility (upwards compatibility), I think it might be good to use a chunks-like architecture of sorts. Currently I think if a chip attribute is ever added, the file can’t be parsed by older players. E.g. in the chips table, prefixing each chip by a length byte would be a good start.
If no attribute ever has more than eight bytes, and we don't need more than 16 attribute types, I could use three bits of the attribute number as a length count similar to how the track commands work.
For the track commands there is a size minification requirement, so there bunching everything together into one byte adds value. However for headers which only occur once, it just complicates the parsing.
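For reference, this is roughly what parsing the packed scheme would look like. The bit placement (4-bit attribute type in bits 3-6, length minus one in bits 0-2) is my guess at one possible layout, and `parse_attributes` is a made-up name.

```python
def parse_attributes(data, known_types):
    """Parse attributes whose key byte packs a type and a length.

    Assumed layout: bits 3-6 = attribute type (0-15),
    bits 0-2 = value length minus one (1-8 bytes).
    """
    pos = 0
    parsed = {}
    skipped = []
    while pos < len(data):
        key = data[pos]
        attr_type = (key >> 3) & 0x0F
        length = (key & 0x07) + 1
        value = data[pos + 1:pos + 1 + length]
        if len(value) != length:
            raise ValueError("truncated attribute")
        if attr_type in known_types:
            parsed[attr_type] = value
        else:
            # Unknown attribute: the length bits still let us step over it.
            skipped.append(attr_type)
        pos += 1 + length
    return parsed, skipped
```

It does work and unknown attributes can be skipped, but the masking and shifting is exactly the extra parsing step I would rather avoid in a header that occurs once.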
NewRisingSun wrote:
grauw wrote:Combine the chip type and subtype into one 16-bit value.
Maybe. :)
grauw wrote:A common structure for all chip types would be easier to parse than the current ID-value based approach. At least the current attributes seem common to all.
A file should not have to specify fields that use the "default" values in my opinion, especially if we are going to add channel-specific volumes, which of course will vary between chips in size anyway.
Let’s put it like this: I need to know the chip type AND subtype to know how to process the rest of the chip attributes. This key-value attribute structure suggests that the ordering can vary, which would be troublesome. Since the attributes are of variable length I would need an oversized buffer, overflow handling, and loop over the data at worst thrice.

Sure you can spec that they need to appear in a certain order, but it would be better to have a more rigid data structure which enforces it, with room for extension at the end, either with optional key-length-value fields or just a structure length field. I would prefer if these optional fields would contain purely ancillary information, and things that are common between all chips to be included in the preceding structure. I think VB’s suggestion captures that.
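As a sketch of the more rigid alternative: each chip entry starts with a length byte, followed by a fixed part and then optional extension bytes. The fixed fields used here (16-bit chip ID, 32-bit clock) are placeholders I picked for the example, not a proposal for the actual field list.

```python
import struct

# Hypothetical fixed part: 16-bit chip ID, 32-bit clock, little-endian.
FIXED = struct.Struct("<HI")

def parse_chip_table(data):
    """Parse length-prefixed chip entries; extension bytes are kept opaque."""
    chips = []
    pos = 0
    while pos < len(data):
        entry_len = data[pos]
        body = data[pos + 1:pos + 1 + entry_len]
        if len(body) != entry_len or entry_len < FIXED.size:
            raise ValueError("malformed chip entry")
        chip_id, clock = FIXED.unpack_from(body)
        # Anything after the fixed part is ancillary and safe to ignore.
        extension = body[FIXED.size:]
        chips.append((chip_id, clock, extension))
        pos += 1 + entry_len
    return chips
```

The player reads the fixed part in one pass in a known order, and an older player steps over whatever a newer spec appends thanks to the length byte.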
NewRisingSun wrote:
grauw wrote:It would be good if there was support for multiple data blocks,
I should make it clearer that the ROM/PCM section doesn't represent ONE data block, but ALL data blocks.
Ah, I note the (to be written) :D. Ok, I’ve given some of my wishes, I will wait and see :).
ValleyBell wrote:Panning and L/R volume would override each other.
Smells like a compromise :wink:.

Post by vampirefrog »

Might I suggest that having multiple songs in one file is not needed for a few reasons:

1. The filesize saved due to common data blocks is not that big; most VGMs are an order of magnitude smaller than their corresponding MP3s, and if we're going to throw the "today's day and age" argument around, well, there you go
2. Adds complexity to the vgm loader/parser library
3. Might interfere with audio players that expect 1 file = 1 song

Also, we could do it the way MDX files handle PCM samples, by storing them in an external PDX file (for example you can have 10 MDX files that use the same PDX file for samples), but that also isn't a good idea for a new format, because people might lose the external file when moving files around.

I believe 1 file = 1 song would make things easier for everyone. The only "victim" here is the hard disk, which can easily take it.

--------

I believe the panning stuff needs more research. For example, some chips have a lowpass filter on them, we might want to support that in the VGM file. Per chip panning, per channel volume, stereo volume, filtering, other DSP effects might be interesting. Perhaps have a 'DSP' header that does all that, even if in most cases it will only do panning. This would add complexity but it might be nice.

--------

Also, my vote is for keeping data block commands, for ease of use. I think it works fine and has its own flexibility. VGM players so far have handled it fine, no worries. I know it might be a nice optimization to allocate all memory at the beginning, before playback starts, but I think you're over-optimizing in this case.

--------

My vote is for tags at beginning of file as well. And as far as tags go, here's how I see it: it's just supposed to give you a minimal amount of info when playing the file, it's not meant to be a data store. What I mean by that is, if you want thorough information on one song or one pack, that should be in the site database or in the text file, not in the individual music files. I think the fixed fields in GD3 right now are fine.

Post by NewRisingSun »

Here is an updated draft for commenting, now carrying the version number 0.01. I tried to incorporate as many of your previous comments as possible. Note the changelog at the end of the file.

I still think specifying overall chip and individual chip channel volumes as signed left+right values is better than separate volume and panning attributes (and especially better than two attributes cancelling each other out), as it neatly incorporates volume, panning and polarity.
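To illustrate what I mean, here is a minimal sketch of applying signed left/right values to a mono channel sample. The scale (256 = unity gain) and the function name are assumptions made for this example; the draft may use a different range.

```python
UNITY = 256  # assumed scale: 256 means "full volume, unchanged polarity"

def apply_signed_gains(sample, left, right):
    """Return the (left, right) output for one mono sample.

    left/right are signed gains: magnitude sets the volume, the ratio
    between them sets the panning, and a negative sign inverts polarity.
    """
    return (sample * left) // UNITY, (sample * right) // UNITY
```

For example, (256, 0) is hard left, (128, 128) is centred at half volume, and (256, -256) is full volume with the right output inverted, all expressed with the same two numbers.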

Metadata, memory write and stream control sections are now written; the only major thing missing is the compression section. There is now only one memory write command which necessarily references a previously-defined data block or the Memory Data section/library file, so no in-stream "write the following 50,000 bytes to PCM RAM". The philosophy behind this is that raw data belongs in the Memory Data section, not in-stream; the data block command is only there to facilitate logging and should not be used in a trimmed and cleaned-up file (although that will be ultimately decided by whoever approves packs).
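In rough terms, the memory write command behaves like the sketch below: it copies a range out of the Memory Data section into a chip's RAM instead of carrying the bytes in-stream. The parameter names and the bounds checks are illustrative assumptions; see the attached draft for the actual command layout.

```python
def memory_write(chip_ram, memory_data, src_offset, length, dst_offset):
    """Copy length bytes from the Memory Data section into chip RAM.

    chip_ram is a bytearray that is modified in place; memory_data holds
    all data blocks of the file (assumed to be loaded up front).
    """
    if src_offset < 0 or src_offset + length > len(memory_data):
        raise ValueError("source range outside Memory Data section")
    if dst_offset < 0 or dst_offset + length > len(chip_ram):
        raise ValueError("destination range outside chip RAM")
    chip_ram[dst_offset:dst_offset + length] = \
        memory_data[src_offset:src_offset + length]
```

The command itself then only needs three numbers, so the command stream stays small no matter how much sample data is rewritten during playback.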
Attachments
vgmspec200-v0.01.txt
(34.01 KiB) Downloaded 328 times