This seems straightforward and already well documented, but it isn’t. Some of the underlying “technical details” aren’t mentioned, yet they have a big effect at the higher level.
The CP/M 2.2 Manual says, on page 5-8, “Internally, all files are divided into 16K byte segments called logical extents, so that counters are easily maintained as 8-bit values. The division into extents is discussed in the paragraphs that follow: however, they are not particularly significant for the programmer, …”.
The statement is literally true: “counters are … 8-bit values” and “extents … are not particularly significant”. However, it suggests that extents contain only 8-bit values and that programmers needn’t worry about extents. Both of those conclusions are wrong.
One-byte Block Numbers
In CP/M 1, a File Control Block (FCB) was a 32-byte copy in memory of 32 bytes in the disk directory. Block numbers were one-byte (8-bit) numbers and the FCB map contained up to 16 block numbers, with each block being 1K bytes in size. That’s where the 16K byte segments come from: CP/M 1, with 16 blocks of 1K each. Everything in a CP/M 1 FCB is either text (the filename) or an 8-bit value. Every extent is 16K bytes long.
CP/M 2.X supports larger disks. That is one of its features. You’ll see more about that in section 6.10 “Disk Parameter Tables” of the manual. If you use a disk larger than 256K bytes, you have to use 2K blocks. You can’t get block numbers larger than 255 into a one-byte (8-bit) value, so Digital Research increased the size of the block instead. That worked for disk sizes up to 512 KB, but it also introduced a problem.
The problem is that “files are divided into 16K byte segments called logical extents”, yet an FCB now has 16 slots that each point to a 2K block. Is an extent 16K or is it 32K?
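The two candidate answers fall straight out of the slot arithmetic. Here is a minimal Python sketch of it; the function and constant names are made up for illustration and are not taken from CP/M or the manual.

```python
LOGICAL_EXTENT_BYTES = 16 * 1024   # the manual's "16K byte segment"
MAP_SLOTS = 16                     # one-byte block numbers in a directory entry


def entry_capacity(block_bytes, slots=MAP_SLOTS):
    """Bytes that one 32-byte directory entry can map."""
    return slots * block_bytes


def logical_extents_per_entry(block_bytes, slots=MAP_SLOTS):
    """How many 16K logical extents fit in what one entry can map."""
    return entry_capacity(block_bytes, slots) // LOGICAL_EXTENT_BYTES


for block_bytes in (1024, 2048):
    print(block_bytes,
          entry_capacity(block_bytes),
          logical_extents_per_entry(block_bytes))
```

With 1K blocks an entry maps exactly one 16K extent. With 2K blocks the same 16 slots map 32K, which is two of the manual's logical extents, and that is where the ambiguity comes from.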
There seem to be two answers, and vendors of CP/M-compatible operating systems could have chosen either, with half being right and half being wrong. However, there is a third answer that makes all of them wrong. Digital Research, the manufacturer of CP/M, chose the third answer. What’s the third answer? Both.
According to Digital Research, an extent is 16KB. So, in memory, the extent counter “ex” ticks over every 16K. But, on disk, each FCB maps 32KB, so there are gaps in the “ex” values recorded on disk for a file. You see ex values like “01”, “03”, “05”, etc. Here are some real examples:
    64k CP/M version 2.2L

    A>c:
    C>dir
    NO FILE
    C>r 104kb.txt

    READ V-2.23 (05-Jul-18) Z80SIM Interface V1
    Read from "104KB.TXT" and write to "104KB.TXT".
    103.625kB written.

    C>dir
    C: 104KB TXT
    C>
    (Ctrl-F10 to exit simulator)

    C:\Test\run-cpm22L>cpmfs drivec.dsk dir2

    Boot Sector
    3E 01 D3 40 21 00 E4 DD 21 EE 00 DD 36 00 02 11  >..@!...!...6...
    7F 31 01 0F 02 7B D3 04 CB 4B 28 15 7A D3 34 79  .1...{...K(.z.4y
    D3 30 97 3D 20 FD DB 34 1F 30 FB DB 30 E6 98 20  .0.= ..4.0..0.. 
    CF 78 D3 32 7A F6 80 D3 34 0E 33 3E 9C D3 30 DB  .x.2z...4.3>..0.
    34 1F 38 04 ED A2 18 F7 DB 30 CB 67 28 B2 DD 35  4.8......0.g(..5
    00 CA 00 FA 3A FC 00 FE 44 20 02 CB F2 06 01 3A  ....:...D .....:
    FA 00 FE 44 20 04 7B EE 02 5F 0E 5F 18 A7 FF FF  ...D .{.._._....
    FF FF FF FF FF FF FF FF 4C 47 44 53 53 44 E5 E5  ........LGDSSD..

    Directory
    00 31 30 34 4B 42 20 20 20 54 58 54 01 00 00 80  .104KB   TXT....
    02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11  ................
    00 31 30 34 4B 42 20 20 20 54 58 54 03 00 00 80  .104KB   TXT....
    12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F 20 21  .............. !
    00 31 30 34 4B 42 20 20 20 54 58 54 05 00 00 80  .104KB   TXT....
    22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F 30 31  "#$%&'()*+,-./01
    00 31 30 34 4B 42 20 20 20 54 58 54 06 00 00 3D  .104KB   TXT...=
    32 33 34 35 00 00 00 00 00 00 00 00 00 00 00 00  2345............

    C:\Test\run-cpm22L>
This is a large (8″ / LG) DSSD disk. It uses 2KB blocks and one-byte block numbers.
The “ex” value is just after the file type (“TXT”). You see 01, 03, 05 and 06. Under CP/M 1 you’d see 00, 01, 02, 03, 04, 05, 06.
You can see 16 block numbers in most of the extents (eg for ex=01, 02 03 04 … 10 11).
An extent, on disk, holds 32K bytes. They are numbered 01, 03, 05, etc.
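To read the dump without counting columns, a 32-byte directory entry can be pulled apart as in the Python sketch below. The field names follow the usual CP/M labels ("ex", "s1", "s2", "rc"). The `parse_entry` helper is hypothetical, and treating a zero in the map as "unused slot" is an assumption made for display purposes.

```python
def parse_entry(hex_text):
    """Split one 32-byte directory entry (given as hex text) into fields."""
    raw = bytes.fromhex(hex_text)
    if len(raw) != 32:
        raise ValueError("a directory entry is 32 bytes")
    # Mask the top bit of name/type bytes, since it can carry attribute flags.
    name = bytes(b & 0x7F for b in raw[1:9]).decode("ascii").rstrip()
    ftype = bytes(b & 0x7F for b in raw[9:12]).decode("ascii").rstrip()
    return {
        "user": raw[0],
        "name": name,
        "type": ftype,
        "ex": raw[12],
        "s1": raw[13],
        "s2": raw[14],
        "rc": raw[15],
        # Assumption: a zero slot means "no block allocated".
        "blocks": [b for b in raw[16:32] if b != 0],
    }


# First and last entries from the CP/M listing above.
first = parse_entry(
    "00 31 30 34 4B 42 20 20 20 54 58 54 01 00 00 80 "
    "02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11"
)
last = parse_entry(
    "00 31 30 34 4B 42 20 20 20 54 58 54 06 00 00 3D "
    "32 33 34 35 00 00 00 00 00 00 00 00 00 00 00 00"
)
print(first["name"], first["type"], hex(first["ex"]), hex(first["rc"]), len(first["blocks"]))
print(last["name"], last["type"], hex(last["ex"]), hex(last["rc"]), len(last["blocks"]))
```

The first entry reports ex=01 with 16 blocks in its map, and the last reports ex=06 with 4 blocks.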
Here’s the same file on a CDOS 2K one-byte block number disk:
    CDOS version 02.58
    Cromemco Disk Operating System
    Copyright (C) 1977, 1983 Cromemco, Inc.

    A.b:
    B.dir
    *** 0 Files, 0 Entries, 0 K Displayed, 490 K Left ***
    B.r 104kb.txt

    READ V-2.23 (05-Jul-18) Z80SIM Interface V1
    Read from "104KB.TXT" and write to "104KB.TXT".
    103.625kB written.

    B.dir
    104KB TXT 104K
    *** 1 Files, 7 Entries, 104 K Displayed, 386 K Left ***
    B.
    (Ctrl-F10 to exit the simulator)

    C:\Test\run-cdos0258-3>cpmfs driveb.dsk dir2

    Boot Sector
    E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5  ................
    E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5  ................
    E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5  ................
    E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5  ................
    E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5  ................
    E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5  ................
    E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5 E5  ................
    E5 E5 E5 E5 E5 E5 E5 E5 4C 47 44 53 53 44 E5 E5  ........LGDSSD..

    CDOS disk label: Userdisk
    Date on disk : 2000-00-00
    Cluster size : 2K
    Directory type : Normal
    Directory size : 128 entries

    Directory
    81 55 73 65 72 64 69 73 6B 00 00 00 10 00 00 20  .Userdisk...... 
    00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 31 30 34 4B 42 20 20 20 54 58 54 00 00 00 80  .104KB   TXT....
    02 03 04 05 06 07 08 09 00 00 00 00 00 00 00 00  ................
    00 31 30 34 4B 42 20 20 20 54 58 54 01 00 00 80  .104KB   TXT....
    0A 0B 0C 0D 0E 0F 10 11 00 00 00 00 00 00 00 00  ................
    00 31 30 34 4B 42 20 20 20 54 58 54 02 00 00 80  .104KB   TXT....
    12 13 14 15 16 17 18 19 00 00 00 00 00 00 00 00  ................
    00 31 30 34 4B 42 20 20 20 54 58 54 03 00 00 80  .104KB   TXT....
    1A 1B 1C 1D 1E 1F 20 21 00 00 00 00 00 00 00 00  ...... !........
    00 31 30 34 4B 42 20 20 20 54 58 54 04 00 00 80  .104KB   TXT....
    22 23 24 25 26 27 28 29 00 00 00 00 00 00 00 00  "#$%&'()........
    00 31 30 34 4B 42 20 20 20 54 58 54 05 00 00 80  .104KB   TXT....
    2A 2B 2C 2D 2E 2F 30 31 00 00 00 00 00 00 00 00  *+,-./01........
    00 31 30 34 4B 42 20 20 20 54 58 54 06 00 00 3D  .104KB   TXT...=
    32 33 34 35 00 00 00 00 00 00 00 00 00 00 00 00  2345............

    C:\Test\run-cdos0258-3>
It is the same disk type (LGDSSD). This one has a CDOS label to allow it to access the extra space and the label matches what I said for the CP/M one before this: 2K blocks and one-byte block numbers.
This time there are 7 extents on the disk. These are all labelled sequentially: 00, 01, 02, 03, 04, 05 and 06.
The block numbers are still one byte each, but there are only 8 in each of the 16K extents (eg for ex=01: 0A 0B 0C 0D 0E 0F 10 11). Those 8 are the last half of the map in the CP/M ex=01 on-disk FCB for the same file. The CP/M on-disk FCB for ex=01 contains the ex=00 half at the front and the ex=01 half at the end.
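The relationship between the two layouts can be shown by splitting each CP/M on-disk FCB into 16K pieces. The Python sketch below is an illustration built only from the two listings. `split_into_16k_entries` is a made-up helper, not something either operating system provides, and it only handles the 2K, one-byte case where an entry spans two logical extents.

```python
def split_into_16k_entries(ex, rc, blocks):
    """Split one CP/M 2K/one-byte directory entry into 16K-per-entry pieces.

    Returns a list of (ex, rc, blocks) tuples with at most 8 blocks each.
    """
    if ex & 1:
        # Odd ex: the front half is a complete 16K extent (80H records)
        # and the back half carries the entry's own ex and rc.
        return [(ex - 1, 0x80, blocks[:8]), (ex, rc, blocks[8:16])]
    # Even ex: only the front 16K half is in use.
    return [(ex, rc, blocks[:8])]


# (ex, rc, block map) for the four entries in the CP/M listing.
cpm_entries = [
    (0x01, 0x80, list(range(0x02, 0x12))),
    (0x03, 0x80, list(range(0x12, 0x22))),
    (0x05, 0x80, list(range(0x22, 0x32))),
    (0x06, 0x3D, [0x32, 0x33, 0x34, 0x35]),
]

pieces = []
for ex, rc, blocks in cpm_entries:
    pieces.extend(split_into_16k_entries(ex, rc, blocks))

for ex, rc, blocks in pieces:
    print(f"ex={ex:02X} rc={rc:02X}", " ".join(f"{b:02X}" for b in blocks))
```

The seven pieces carry the same ex, rc and block numbers as the seven entries in the CDOS listing.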
CDOS contains a mechanism that allows us to look more directly at the directory. In case you’re thinking perhaps cpmfs is displaying it wrong, here’s what CDOS has to say for itself:
    B.dir
    104KB TXT 104K
    *** 1 Files, 7 Entries, 104 K Displayed, 386 K Left ***
    B.debug sys.dir
    DEBUG version 00.20
    NEXT = 1100 NEXTM = 1100
    -d100
    0100 81 55 73 65 72 64 69 73 6B 00 00 00 10 00 00 20  .Userdisk...... 
    0110 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0120 00 31 30 34 4B 42 20 20 20 54 58 54 00 00 00 80  .104KB   TXT....
    0130 02 03 04 05 06 07 08 09 00 00 00 00 00 00 00 00  ................
    0140 00 31 30 34 4B 42 20 20 20 54 58 54 01 00 00 80  .104KB   TXT....
    0150 0A 0B 0C 0D 0E 0F 10 11 00 00 00 00 00 00 00 00  ................
    0160 00 31 30 34 4B 42 20 20 20 54 58 54 02 00 00 80  .104KB   TXT....
    0170 12 13 14 15 16 17 18 19 00 00 00 00 00 00 00 00  ................
    -d
    0180 00 31 30 34 4B 42 20 20 20 54 58 54 03 00 00 80  .104KB   TXT....
    0190 1A 1B 1C 1D 1E 1F 20 21 00 00 00 00 00 00 00 00  ...... !........
    01A0 00 31 30 34 4B 42 20 20 20 54 58 54 04 00 00 80  .104KB   TXT....
    01B0 22 23 24 25 26 27 28 29 00 00 00 00 00 00 00 00  "#$%&'()........
    01C0 00 31 30 34 4B 42 20 20 20 54 58 54 05 00 00 80  .104KB   TXT....
    01D0 2A 2B 2C 2D 2E 2F 30 31 00 00 00 00 00 00 00 00  *+,-./01........
    01E0 00 31 30 34 4B 42 20 20 20 54 58 54 06 00 00 3D  .104KB   TXT...=
    01F0 32 33 34 35 00 00 00 00 00 00 00 00 00 00 00 00  2345............
    -
Despite SYS.DIR not being present as a file on any disk, you can load it successfully into memory with DEBUG.COM (the CDOS equivalent of ZSID.COM). You can also edit it and write the non-existent “file” back over the directory, which is why it’s not generally mentioned to users.
You can see the disk label and the same items in the on-disk FCBs. It’s not just cpmfs.
It’s a little harder to do for the CP/M disk (image) but you can use a hex editor or viewer to see what’s in the directory. It’s a match to what cpmfs is saying.
If you have a disk image from the era that was created on a system from another manufacturer, there’s a chance you’ll also see 16x2K blocks in an on-disk FCB with the “ex” values being sequential. That implies they guessed that Digital Research would come out with 32K byte extents for larger disks. (They would also have had to cope with the “rc” field holding a record count of 0-100H, so they probably used the “s2” byte as a high-order rc value. That is a guess only.)
Digital Research did their high-order rc value by using the low bit/s of the “ex” value, selected by the extent mask (“exm”, which is 1 for this disk). The LHS of ex is the equivalent 32K byte extent number (01 03 05 06 becomes 00 01 02 03, a nice sequence) and the RHS bit/s go to the front of the “rc” byte (sort of). Writing the CP/M ex&exm and rc bytes together, you get an “rc” of 180H 180H 180H 03DH. 180H should be read as 1×80H+80H and 03DH as 0×80H+3DH. There are 100H records in each of the first three on-disk extents and 03DH records in the last one.
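That reading of ex and rc can be checked against the file size reported by the READ utility. The Python sketch below assumes an extent mask of 1 for this disk format. The function names are illustrative.

```python
RECORD_BYTES = 128
EXM = 1  # extent mask assumed for 2K blocks with one-byte block numbers


def records_in_entry(ex, rc, exm=EXM):
    """Records held by one on-disk FCB: (ex & exm) full 80H chunks plus rc."""
    return (ex & exm) * 0x80 + rc


def on_disk_extent_number(ex, exm=EXM):
    """The 'LHS' of ex: the sequence number of the on-disk (32K) extent."""
    return ex // (exm + 1)


# (ex, rc) pairs from the CP/M directory listing.
entries = [(0x01, 0x80), (0x03, 0x80), (0x05, 0x80), (0x06, 0x3D)]

counts = [records_in_entry(ex, rc) for ex, rc in entries]
numbers = [on_disk_extent_number(ex) for ex, _ in entries]
total_bytes = sum(counts) * RECORD_BYTES

print([hex(c) for c in counts])   # record count per on-disk extent
print(numbers)                    # on-disk extent sequence
print(total_bytes / 1024, "KB")   # compare with "103.625kB written"
```

The counts come out as 100H, 100H, 100H and 3DH. The total of 829 records is 106,112 bytes, or 103.625 KB, which matches what READ reported when it wrote the file.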
Does it Matter?
Here’s what happens if you take the two disks (CDOS 2K 8×1 and CP/M 2K 16×1) and swap them:
    B.type 104kb.txt
    File not found
    B.dir
    104KB TXT 104K
    *** 1 Files, 4 Entries, 104 K Displayed, 200 K Left ***
    B.
In CDOS, the CP/M disk doesn’t have a disk label so it thinks the blocks are 1K in size (run CDOS STAT to see this). The “200 K Left” figure is 254 blocks (the maximum with 1-byte block numbers in CDOS), minus the 52 blocks used by “104KB.TXT” (which were written as 52 blocks of 2KB), minus 2 directory blocks: 254 - 52 - 2 = 200. There are 200 unused blocks and CDOS thinks they are 1KB each.
The file shows up in the DIR listing because it has extents in the directory. However, you can’t TYPE it because extent 0 (the start) is missing. It’s a mess.
Here’s the other one:
    C>dir
    C: 104KB TXT : 104KB TXT
    C>type 104kb.txt
    0001 The quick brown fox jumped over the lazy dogs.
    0002 The quick brown fox jumped over the lazy dogs.
    ...
    0306 The quick brown fox jumped over the lazy dogs.
    0307 The quick brown fox jumped over the lazy dogs.
    0308 The quick brown fox jumped over the lazy dogs.
    0309 The quick brown fox jumped over the lazy dogs.
    0310 Th
    C>
CP/M has 2K blocks in its DPB in its BIOS so it treats the blocks as 2K.
It successfully types all of the text up to block 09 which is the last block in the 8×1 byte CDOS FCB.
Then it stops because it doesn’t think there are any more records (ex=00 and rc=80H, so there are 128 records in this on-disk extent, not a full 256).
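The exact cut-off point can be checked with a little arithmetic. This Python sketch assumes each line of the test file is 53 bytes (the 51 visible characters plus CR and LF). That is not stated anywhere above, but it is consistent with the output.

```python
RECORD_BYTES = 128
BLOCK_BYTES = 2048
EXM = 1  # extent mask CP/M uses for this disk format

# The CDOS entry that CP/M reads first: ex=00, rc=80H.
ex, rc = 0x00, 0x80
records = (ex & EXM) * 0x80 + rc          # records CP/M believes exist
readable = records * RECORD_BYTES         # bytes it will read
blocks_read = readable // BLOCK_BYTES     # 2K blocks covered


def line(n):
    """One line of the test file, assuming CR LF line endings."""
    return f"{n:04d} The quick brown fox jumped over the lazy dogs.\r\n"


full_lines, leftover = divmod(readable, len(line(1)))
tail = line(full_lines + 1)[:leftover]

print(records, readable, blocks_read)
print(full_lines, repr(tail))
```

Under that assumption, 128 records cover 16,384 bytes, which is the 8 blocks 02 to 09, and the text runs out 7 bytes into line 0310, at “0310 Th”.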
What stands out, though, is the two entries in the DIR listing for the same file. If it were a CP/M 2K 1-byte FCB, it would only have ex=01, 03, 05, … or an even ex number at the end. DIR sees the ex=00 and thinks that is the file so it lists it. Then it sees ex=01 and that’s a valid first extent too so it lists that too. They just happen, in this case, to be for the same file. It’s a mess.
Clearly, it does matter, and a lot. It is hardly surprising that we had to resort to serial lines and null modems to copy files between computers rather than just putting a disk from one in the other.
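Both symptoms can be reproduced with one small test. The sketch below is a guess at the matching rule, inferred from the listings rather than taken from CP/M or CDOS source: an entry counts as a file’s first extent when its ex value, with the low exm bits ignored, is zero.

```python
def looks_like_first_extent(ex, exm):
    """Assumed rule: ex is a 'first extent' if it is zero once the
    low exm bits are masked off (ex is treated as a 5-bit value)."""
    return (ex & ~exm & 0x1F) == 0


cdos_disk_ex = [0, 1, 2, 3, 4, 5, 6]   # ex values written by CDOS
cpm_disk_ex = [1, 3, 5, 6]             # ex values written by CP/M

# CP/M (exm=1) reading the CDOS disk: two entries qualify, so two DIR lines.
cdos_on_cpm = [ex for ex in cdos_disk_ex if looks_like_first_extent(ex, 1)]

# CP/M (exm=1) reading its own disk: exactly one entry qualifies.
cpm_on_cpm = [ex for ex in cpm_disk_ex if looks_like_first_extent(ex, 1)]

# CDOS treating entries as 16K each (exm=0) on the CP/M disk: none qualify,
# which fits "File not found" when trying to TYPE the file.
cpm_on_cdos = [ex for ex in cpm_disk_ex if looks_like_first_extent(ex, 0)]

print(cdos_on_cpm, cpm_on_cpm, cpm_on_cdos)
```

Under that assumed rule, the CDOS disk in CP/M yields two “first” entries (ex=00 and ex=01), while the CP/M disk in CDOS yields none.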
Knowing that the extent numbers (ex) on disk aren’t necessarily sequential any more would have helped (and “been particularly significant for the programmer” if they were writing a directory lister or a disk utility program).
Two-Byte Block Numbers
2K byte blocks and one-byte block numbers will only get you so far (512 KB). For disks larger than that, CP/M allows you to use 4K blocks thus deferring the problem to 1024 KB disks, or to use two-byte block numbers instead.
In most cases, two-byte block numbers are a much better solution because they allow finer granularity and a much higher limit on disk space. With 2K blocks and two-byte block numbers, the upper limit goes to 128 MB per disk (65,536 blocks of 2K). That is the arithmetic ceiling of the block numbers; the CP/M 2.2 manual quotes a limit of 8 MB per logical drive. The map area of the FCB goes from 16x 1-byte values pointing to 1KB each, to 8x 2-byte values pointing to 2KB each. File extents (on disk or in memory) refer to “16K byte segments” again and all of the counter values go back to what they were. It’s a lot neater.
Unfortunately, for all of CP/M’s flexibility, you can’t tell it to use two-byte block numbers. It’s a choice it makes for itself.
If the disk size is more than 256 blocks, CP/M will switch to two-byte block numbers. It will only do so if the block size is at least 2K. This means it will never use two-byte block numbers on disks smaller than 512 KB. If you cheat and tell it the disk is bigger, you are likely to get “WRITE ERROR” or “SEEK ERROR” messages instead of “DISK FULL” ones.
So:
- CP/M Ver 1 FCBs (up to 256 one-byte blocks of 1KB each) for disks up to 256 KB.
- CP/M one-byte 2K, odd extents on disk FCBs for 256KB < disk size <= 512KB.
- CP/M two-byte 2K FCBs for 512KB < disk size <= 128 MB.
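The selection rule and the three cases in the list can be written down as a short Python sketch. It mirrors the list above only: it ignores the 4K-block option, and the function names and return shapes are invented for illustration.

```python
def block_number_bytes(disk_kb, block_kb):
    """Width of a block number, per the rule described above:
    more than 256 blocks on the disk means two-byte block numbers."""
    total_blocks = disk_kb // block_kb
    if total_blocks > 256 and block_kb < 2:
        raise ValueError("more than 256 blocks needs a block size of at least 2K")
    return 2 if total_blocks > 256 else 1


def fcb_style(disk_kb):
    """Return (block size in KB, bytes per block number) for a disk size."""
    if disk_kb <= 256:
        return (1, 1)          # CP/M 1 layout: 16 one-byte slots of 1K
    if disk_kb <= 512:
        return (2, 1)          # 16 one-byte slots of 2K, odd ex on disk
    if disk_kb <= 128 * 1024:
        return (2, 2)          # 8 two-byte slots of 2K
    raise ValueError("beyond what 2K blocks and two-byte numbers can address")


for size_kb in (256, 490, 1024):
    block_kb, width = fcb_style(size_kb)
    print(size_kb, block_kb, width, 16 // width, block_number_bytes(size_kb, block_kb))
```

The 490 KB case is the LGDSSD disk from the earlier listings: 245 blocks of 2K, so it stays on one-byte block numbers.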
CDOS can also use two-byte block numbers, and for these its behaviour seems to match CP/M exactly.
Two-byte block number FCBs look like this:
    00 31 30 34 4B 42 20 20 20 54 58 54 00 00 00 80  .104KB   TXT....
    04 00 05 00 06 00 07 00 08 00 09 00 0A 00 0B 00  ................
    00 31 30 34 4B 42 20 20 20 54 58 54 01 00 00 80  .104KB   TXT....
    0C 00 0D 00 0E 00 0F 00 10 00 11 00 12 00 13 00  ................
    00 31 30 34 4B 42 20 20 20 54 58 54 02 00 00 80  .104KB   TXT....
    14 00 15 00 16 00 17 00 18 00 19 00 1A 00 1B 00  ................
    00 31 30 34 4B 42 20 20 20 54 58 54 03 00 00 80  .104KB   TXT....
    1C 00 1D 00 1E 00 1F 00 20 00 21 00 22 00 23 00  ........ .!.".#.
    00 31 30 34 4B 42 20 20 20 54 58 54 04 00 00 80  .104KB   TXT....
    24 00 25 00 26 00 27 00 28 00 29 00 2A 00 2B 00  $.%.&.'.(.).*.+.
    00 31 30 34 4B 42 20 20 20 54 58 54 05 00 00 80  .104KB   TXT....
    2C 00 2D 00 2E 00 2F 00 30 00 31 00 32 00 33 00  ,.-.../.0.1.2.3.
    00 31 30 34 4B 42 20 20 20 54 58 54 06 00 00 3D  .104KB   TXT...=
    34 00 35 00 36 00 37 00 00 00 00 00 00 00 00 00  4.5.6.7.........
You can see “ex” going from 00 to 06. “rc” is 80 for each full extent. You can even see the two-byte block numbers in the map area (eg for ex=00: 04 00, 05 00, 06 00, …, 0B 00).
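Decoding the two-byte map is a matter of reading eight little-endian 16-bit values. Here is a short Python sketch using the first and last map rows from the listing above. As before, treating zero as “unused slot” is an assumption.

```python
import struct


def decode_two_byte_map(hex_text):
    """Decode a 16-byte FCB map holding eight little-endian block numbers."""
    raw = bytes.fromhex(hex_text)
    if len(raw) != 16:
        raise ValueError("the FCB map area is 16 bytes")
    # "<8H" = eight unsigned 16-bit values, low byte first.
    return [b for b in struct.unpack("<8H", raw) if b != 0]


first_map = decode_two_byte_map("04 00 05 00 06 00 07 00 08 00 09 00 0A 00 0B 00")
last_map = decode_two_byte_map("34 00 35 00 36 00 37 00 00 00 00 00 00 00 00 00")

print([hex(b) for b in first_map])
print([hex(b) for b in last_map])

# Blocks 04H through 37H inclusive, at 2K each.
total_blocks = last_map[-1] - first_map[0] + 1
print(total_blocks, "blocks,", total_blocks * 2, "KB allocated")
```

The file occupies blocks 04H to 37H, which is 52 blocks of 2K, the same 104 KB allocation as on the one-byte disks.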
Summary
There are at least four different FCB types: the original (CP/M 1) type, the CP/M 2 2K one-byte style, the CDOS 2K one-byte style, and a two-byte version.
More Information
This is part of the CP/M topic.